I mentioned a few weeks ago that I planned on making I's/J's and U's/V's look the same on the back-end, while preserving their traditional orthographies on the front-end. I've just completed this task!
My main motivation for this update is that certain passages stored in The Latin Library follow the older conventions of writing J for consonantal I, or U for both consonantal and vocalic V. Numen's parsing engine was having trouble recognizing forms like jecit (iecit) and uiuus (vivus). After a bit of work, the engine now recognizes more forms than ever. Internally, J's are stored as I's and U's are stored as V's.
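To make the idea concrete, here is a minimal sketch of that kind of folding in Python. It is not Numen's actual code; the name `normalize` and the translation table are made up for illustration, and the only thing taken from the engine is the rule itself (J becomes I, U becomes V).

```python
# Hypothetical sketch: fold J -> I and U -> V so that spelling variants
# of the same word share one internal (back-end) form.
_FOLD = str.maketrans({"j": "i", "J": "I", "u": "v", "U": "V"})


def normalize(word: str) -> str:
    """Return the internal form of a word, leaving all other letters alone."""
    return word.translate(_FOLD)


# Both spellings of each word collapse to the same internal key.
print(normalize("jecit") == normalize("iecit"))
print(normalize("uiuus") == normalize("vivus"))
```

Since the folding only happens on the lookup key, the original spelling can still be shown to the reader untouched.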
Another project I completed at the same time is an order-of-magnitude speed improvement for parsing. I was looking for ways to make the engine faster and discovered a shortcut that boosts speed tremendously. The engine used to spend between 250ms and 500ms parsing each word! That was always disappointing to me, but I had gotten around the problem by caching the results. Now, however, parsing a word takes about 25ms!
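For anyone curious what caching parse results can look like, here is a rough Python sketch. It is only an illustration: `slow_parse` is a stand-in for an expensive parser, not Numen's engine, and its return value is a placeholder.

```python
from functools import lru_cache


def slow_parse(word: str) -> tuple:
    """Hypothetical stand-in for an expensive parse; returns a dummy result."""
    return (word, len(word))


@lru_cache(maxsize=None)
def cached_parse(word: str) -> tuple:
    # The first call for a given word does the real work; repeat calls
    # with the same word are answered from the in-memory cache.
    return slow_parse(word)


cached_parse("amo")
cached_parse("amo")
print(cached_parse.cache_info())
```

A cache only helps with words that have been seen before, which is why a faster parser still matters for new passages.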
Why bother improving the speed? Because soon I will be implementing word lists and frequency lists! A word list, of course, is just a "mini-lexicon" that defines only the words in your chosen passage, and a frequency list is a list of words in order of how often they appear in a passage. The word list will make it quick to work on the vocabulary for a passage, and the frequency list will help Latin students study more effectively by giving them the most frequent words first. I'm very excited about this feature, but I don't anticipate it will be done before January 10th (giving me the winter holiday to work on it).
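Here is a rough sketch of the frequency-list idea in Python. It is an assumption-laden toy, not the planned feature: the function name `frequency_list` is made up, and it counts surface forms as they appear in the text, whereas a real version would presumably group inflected forms under their dictionary entries using the parser.

```python
import re
from collections import Counter


def frequency_list(passage: str) -> list:
    """Return (word, count) pairs, most frequent first.

    Words are lowercased runs of ASCII letters; ties keep the order in
    which the words first appear in the passage.
    """
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", passage)]
    return Counter(words).most_common()


for word, count in frequency_list("Arma virumque cano, arma cano arma"):
    print(word, count)
```

The word list falls out of the same data: the distinct words are just the first element of each pair.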
That's all for now!