To complete the review, two models are now presented that combine the HMM with a dynamical model that is not an SSM.
The hidden Markov ICA algorithm by Penny et al. is an interesting variant of the switching SSMs. Both its dynamics and its observation mapping switch, but the dynamics are not modelled with a linear dynamical system. Instead, the observations are transformed into independent components, each of which is then predicted separately with a generalised autoregressive (GAR) model. This allows samples from further back to be used for the prediction, but each component is predicted strictly independently of the others.
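The per-component prediction described above can be illustrated with a minimal sketch. It is not the algorithm of Penny et al.: it assumes a single HMM state whose unmixing matrix is already known, and it replaces the GAR model with an ordinary least-squares autoregressive fit. All function names (`simulate_ar`, `fit_ar`, `predict_next`, `predict_components`) and the toy mixing matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_ar(coeffs, init, length):
    """Noise-free AR series, used here only to generate toy data."""
    series = list(init)
    order = len(coeffs)
    while len(series) < length:
        recent = series[-1 : -order - 1 : -1]  # most recent sample first
        series.append(float(np.dot(coeffs, recent)))
    return np.array(series)

def fit_ar(series, order):
    """Least-squares AR(order) coefficients for a 1-D series."""
    # Column k holds the sample lagged by k + 1 steps.
    lagged = np.column_stack(
        [series[order - k - 1 : len(series) - k - 1] for k in range(order)]
    )
    targets = series[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, targets, rcond=None)
    return coeffs

def predict_next(series, coeffs):
    """One-step-ahead prediction from the last len(coeffs) samples."""
    order = len(coeffs)
    recent = series[-1 : -order - 1 : -1]  # most recent sample first
    return float(np.dot(coeffs, recent))

def predict_components(observations, unmixing, order):
    """Unmix (T, d) observations, then predict each component on its own."""
    sources = observations @ unmixing.T
    return np.array([
        predict_next(sources[:, i], fit_ar(sources[:, i], order))
        for i in range(sources.shape[1])
    ])

# Toy data: two independent sources mixed by a hypothetical matrix.
sources = np.column_stack([
    simulate_ar([0.5, -0.3], [1.0, 0.5], 30),
    simulate_ar([0.6, -0.5], [1.0, 0.2], 30),
])
mixing = np.array([[1.0, 0.4], [0.3, 1.0]])
observations = sources @ mixing.T
unmixing = np.linalg.inv(mixing)  # assumed known for the active state

# Predict the last sample from all earlier ones, then map back.
next_sources = predict_components(observations[:-1], unmixing, order=2)
next_observation = mixing @ next_sources
```

Each component gets its own coefficient vector and no cross-component terms enter the prediction, which is the restriction noted in the text. In a switching model, one such unmixing matrix and set of AR coefficients would be kept per HMM state.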
The MLP/HMM hybrid by Chung and Un uses an MLP network for nonlinear prediction and an HMM to model its prediction errors. The predictor operates in the observation space, so the model is not an NSSM. The prediction errors are modelled with a mixture of Gaussians for each HMM state. The model is trained using maximum likelihood and a discriminative criterion for minimal classification error in a speech recognition application.
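The structure of such a hybrid can be sketched as follows, under several simplifying assumptions that are not taken from Chung and Un: the observations are scalar, the MLP has a single tanh hidden layer with given weights, and only the likelihood evaluation is shown, computed with the standard forward algorithm in the log domain. Training, including the discriminative criterion, is omitted. The names `mlp_predict`, `prediction_errors`, `gmm_logpdf` and `forward_loglik` are hypothetical.

```python
import numpy as np

def mlp_predict(window, w1, b1, w2, b2):
    """One-hidden-layer MLP mapping a window of past samples to the next one."""
    hidden = np.tanh(w1 @ window + b1)
    return float(w2 @ hidden + b2)

def prediction_errors(series, order, params):
    """Errors of the MLP predictor, computed in the observation space."""
    preds = np.array([
        mlp_predict(series[t - order : t], *params)
        for t in range(order, len(series))
    ])
    return series[order:] - preds

def gmm_logpdf(e, weights, means, variances):
    """Log density of a scalar e under a 1-D mixture of Gaussians."""
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * variances)
        - 0.5 * (e - means) ** 2 / variances
    )
    return np.logaddexp.reduce(log_comp)

def forward_loglik(errors, log_pi, log_trans, gmms):
    """Log-likelihood of the error sequence under an HMM with GMM emissions."""
    # log_b[t, s] is the log emission density of errors[t] in state s.
    log_b = np.array([[gmm_logpdf(e, *g) for g in gmms] for e in errors])
    alpha = log_pi + log_b[0]
    for t in range(1, len(errors)):
        # Sum over the previous state in the log domain.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_trans, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)
```

Here `gmms` is a list with one `(weights, means, variances)` tuple of arrays per HMM state, and `params` is the tuple `(w1, b1, w2, b2)`. The HMM never sees the observations directly, only the residuals left by the predictor, which is why the model does not count as an NSSM.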