Maximum mutual information (MMI) has become one of the two de facto methods for sequence-level training of speech recognition acoustic models. This paper aims to isolate, identify and bring forward the implicit modelling decisions induced by the design and implementation of the standard finite state transducer (FST) lattice-based MMI training framework. In particular, the paper investigates the necessity of maintaining a preselected numerator alignment and highlights the importance of determinizing FST denominator lattices on the fly. On-the-fly FST lattice determinization is shown mathematically to guarantee discrimination at the hypothesis level, and its efficacy is shown empirically by training deep CNN models on an 18K-hour Mandarin dataset and a 2.8K-hour English dataset. On assistant and dictation tasks, the approach achieves between 2.3% and 4.6% relative WER reduction (WERR) over the standard FST lattice-based approach.