This Blogger site is just a holding place for our news!

To access the Latin dictionary, click this link:

Numen - The Latin Lexicon - An Online Latin Dictionary

Showing posts with label parsing engine. Show all posts

Wednesday, October 14, 2009

J's and U's Updated / Speed Increases

I mentioned a few weeks ago that I planned on making I's/J's and U's/V's look the same on the back-end, while preserving their traditional orthographies on the front-end. I've just completed this task!

My main motivation for this update is that certain passages stored in The Latin Library reflect the older conventions of using J's for consonantal I's, or U's for both consonantal and vocalic V's. Numen's parsing engine was having trouble recognizing forms like jecit (iecit) and uiuus (vivus). After a bit of work, the engine now recognizes more possibilities than ever. Incidentally, J's are stored internally as I's and U's as V's.
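For the curious, here is a minimal Python sketch of the general idea: fold J into I and U into V to build a back-end lookup key, while keeping the traditional spelling for display. This is not Numen's actual code. The names `normalize_form` and `build_index` are made up for illustration, and the real engine may handle this quite differently.

```python
# Hypothetical sketch: fold J -> I and U -> V for back-end matching.
# The function and variable names here are illustrative assumptions.
_FOLD = str.maketrans({"j": "i", "J": "I", "u": "v", "U": "V"})


def normalize_form(word):
    """Return the back-end key for a Latin word form."""
    return word.translate(_FOLD)


def build_index(display_forms):
    """Map each normalized key to the spelling shown on the front-end."""
    return {normalize_form(form): form for form in display_forms}


index = build_index(["iecit", "vivus"])

# Older spellings resolve to the same key as the traditional ones.
print(index[normalize_form("jecit")])   # iecit
print(index[normalize_form("uiuus")])   # vivus
```

Because both the stored forms and the user's input pass through the same fold, either spelling convention finds the same entry, and the displayed orthography is never altered.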

Another project I completed at the same time is an order-of-magnitude speed improvement for parsing. I was looking for ways to make the engine faster and discovered a shortcut that boosts speed tremendously. The engine used to spend between 250ms and 500ms parsing each word! That was always disappointing to me, but I had gotten around the problem by caching the results. Now, however, parsing a word takes about 25ms!
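To illustrate the caching workaround mentioned above (not the new shortcut itself), here is a small Python sketch using the standard library's `functools.lru_cache`. The function `slow_parse` is a hypothetical stand-in for an expensive parse, and an in-memory cache is an assumption; the site's cache may well live somewhere else, such as a database.

```python
from functools import lru_cache


def slow_parse(word):
    """Hypothetical stand-in for an expensive morphological parse."""
    # A real engine would do its analysis here; we return a placeholder.
    return ("analysis of " + word,)


@lru_cache(maxsize=None)
def cached_parse(word):
    """Parse a word once, then serve repeat requests from memory."""
    return slow_parse(word)


cached_parse("amat")   # first call: runs slow_parse
cached_parse("amat")   # second call: answered from the cache
print(cached_parse.cache_info())
```

Caching only helps for words that have been seen before, which is why a faster parse still matters for anything that processes a whole passage of fresh words.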

Why bother improving the speed? Because soon I will be implementing word lists and frequency lists! A word list, of course, is just a "mini-lexicon" that defines only the words in your chosen passage, and a frequency list is a list of words in order of how often they appear in a passage. The word list will be helpful to quickly work on vocabulary for a passage, and a frequency list will help Latin students study more effectively by giving them the most frequent words first. I'm very excited about this feature, but I don't anticipate it will be done before January 10th (giving me the winter holiday to work on it).
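Here is a rough Python sketch of what those two lists amount to. It is only an illustration under simplifying assumptions: it counts surface forms rather than dictionary headwords (a real version would presumably run each word through the parsing engine first), and `tiny_lexicon` is a made-up stand-in for the dictionary.

```python
import re
from collections import Counter


def tokenize(passage):
    """Split a passage into lowercase words (ASCII letters only)."""
    return re.findall(r"[a-z]+", passage.lower())


def frequency_list(passage):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    counts = Counter(tokenize(passage))
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


def word_list(passage, lexicon):
    """Return a mini-lexicon holding only the words found in the passage."""
    return {w: lexicon[w] for w in sorted(set(tokenize(passage))) if w in lexicon}


# Hypothetical data for illustration only.
tiny_lexicon = {"arma": "arms, weapons", "cano": "I sing"}
passage = "Arma virumque cano, arma"

print(frequency_list(passage))
print(word_list(passage, tiny_lexicon))
```

Since every word in a passage has to be parsed to build either list, the per-word parsing time adds up quickly, which is the reason the speed improvement comes first.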

That's all for now!

Saturday, February 7, 2009

Lucretian Updates

Based on the recent lack of news on this site, you might assume that the Latin Lexicon is dormant or stagnant. Yet nothing could be further from the truth! In fact there's been quite a bit of behind-the-scenes activity!

Let me start out by apologizing for not updating more often. This semester has turned out to be rather more packed with excitement than the last one. Since I've been short on time, I let blogging and site documentation slip.

In terms of back-end coding, there hasn't been much activity. I've cleaned up a few bugs here and there. For instance, I fixed a UTF-8 bug in the Word Study Tool.

But in more interesting news, I've been a busy beaver correcting words that appear in Lucretius' De Rerum Natura Book III. For the last five semesters our classes have focused on Augustan poets. Since the vocabulary is somewhat stock among those guys, I ended up making very few corrections to the dictionary entries themselves. But since Lucretius uses a whole new set of vocabulary, the amount of "cleanup" is massive! This is a good thing, since words that contain errors get fixed and the Latin Lexicon slowly improves in quality.

Since the news/blog section of this site gives the first impression to new visitors, it might look bad if the front page isn't regularly updated. Good impressions are the best impressions, so I'll try to update more regularly despite the crazy-busy semester I'm having.

On that note, it's back to the grindstone for me. Valēte!

Update: To the parsing engine I added a couple of pronouns: quisquam and quidam, since Lucretius is so fond of them.
