Bookmark
Personal names around the world
www.w3.org/International/questions/qa-personal-names.en.php?changelang=en, posted 2014 by peter in culture language reference
People who create web forms, databases, or ontologies are often unaware how different people’s names can be in other countries. They build their forms or databases in a way that assumes too much on the part of foreign users. This article will first introduce you to some of the different styles used for personal names, and then some of the possible implications for handling those on the Web.
Bookmark
Find kanji by radicals - Denshi Jisho
jisho.org/kanji/radicals, posted 2014 by peter in japan language online reference
Click on the parts that are in the kanji you are looking for. You can click on them again to de-select them.
Amongst the thousands of languages spoken across the world, here are just eighty. How many can you distinguish between?
Bookmark
Who speaks Latin these days?
m.bbc.co.uk/news/magazine-21412604, posted 2013 by peter in history language religion
Nicholas Ostler, author of Ad Infinitum, a history of Latin, and the Chairman of the Foundation for Endangered Languages, compares Latin's presence on the internet (interretialis) to a small European language - it is comparable to "Icelandic, Lithuanian or Slovenian". § Ostler emails his brother in Latin for fun and enthusiasts maintain websites such as Circulus Latinus Interretialis (Internet Latin Circle), Grex Latine Loquentium (Flock of those Speaking Latin) and the connected online paper Ephemeris. The Finnish radio station YLE even broadcasts news in Latin.
Bookmark
TextBlob: Simplified Text Processing — TextBlob 0.5.0 documentation
https://textblob.readthedocs.org/en/latest/, posted 2013 by peter in development free language nlp python software toread
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, translation, and more.
Bookmark
An Efficient Way to Extract the Main Topics from a Sentence | The Tokenizer
thetokenizer.com/2013/05/09/efficient-way-to-extract-the-main-topics-of-a-sentence/, posted 2013 by peter in language nlp python toread
Last week, while working on new features for our product, I had to find a quick and efficient way to extract the main topics/objects from a sentence. Since I’m using Python, I initially thought that it’s going to be a very easy task to achieve with NLTK. However, when I tried its default tools (POS tagger, Parser…), I indeed got quite accurate results, but performance was pretty bad. So I had to find a better way. Like I did in my previous post, I’ll start with the bottom line – Here you can find my code for extracting the main topics/noun phrases from a given sentence. It works fine with real sentences (from a blog/news article). It’s a bit less accurate compared to the default NLTK tools, but it works much faster!
Bookmark
Google Translator Toolkit
translate.google.com/toolkit, posted 2013 by peter in conversion free language nlp online
Google Translator Toolkit is a powerful and easy-to-use editor that helps translators work faster and better.
Bookmark
Which Unicode characters can you depend on? | The Endeavour
www.johndcook.com/blog/2013/04/11/which-unicode-characters-can-you-depend-on/, posted 2013 by peter in language typography unicode webdesign
So what characters can you count on nearly everyone being able to see? To answer this question, I looked at the characters in the intersection of several common fonts: Verdana, Georgia, Times New Roman, Arial, Courier New, and Droid Sans. My thought was that this would make a very conservative set of characters. There are 585 characters supported by all the fonts listed above. Most of the characters with code points up to U+01FF are included. This range includes the code blocks for Basic Latin, Latin-1 Supplement, Latin Extended-A, and some of Latin Extended-B. The rest of the characters in the intersection are Greek and Cyrillic letters and a few scattered symbols. Flat, natural, sharp, and gradient didn’t make the cut.
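As a rough illustration of the idea in the excerpt above, the sketch below flags characters in a string that fall outside the range the post describes as mostly covered (code points up to U+01FF). This is only an approximation and an assumption on my part: the post's actual set is the 585-character intersection of the listed fonts, which also includes some Greek and Cyrillic letters and a few symbols, and omits some characters inside this range. The names `SAFE_LIMIT` and `risky_characters` are hypothetical and do not come from the post.

```python
import unicodedata

# Assumption: approximate the "dependable" set by the range the excerpt
# mentions (code points up to U+01FF). The real font intersection differs
# at the edges, so treat the result as a hint, not a guarantee.
SAFE_LIMIT = 0x01FF


def risky_characters(text):
    """Return (char, code point label, Unicode name) for each distinct
    character in `text` whose code point is above SAFE_LIMIT, in order
    of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) > SAFE_LIMIT and ch not in seen:
            seen.append(ch)
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in seen
    ]


if __name__ == "__main__":
    # The flat sign and nabla (gradient) are among the symbols the
    # excerpt says did not make the cut.
    for ch, code_point, name in risky_characters("café in B♭, ∇f"):
        print(code_point, name)
```

A check like this could run over page copy before publishing, with anything it reports replaced by a spelled-out word or served with a web font that is known to contain the glyph.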
Bookmark
'Ogooglebar' ... and 14 Other Swedish Words We Should Incorporate Into English Immediately - Megan Garber - The Atlantic
www.theatlantic.com/technology/archive/2013/03/ogooglebar-and-14-other-swedish-words-we-should-incorporate-into-english-immediately/274383/, posted 2013 by peter in humor language list opinion sweden
Swedish, adding to all the awesomeness, has proven especially adept at coining new words for the new circumstances occasioned by new technologies. Below, some of the best Swedologisms I could find, via the Swedish news site The Local. We should, obviously, incorporate them into English as soon as possible.
Bookmark
Delver - a natural language interface to your app
delver.io/, posted 2013 by peter in development language nlp software toread
Down in the depths of your organisation, you have a treasure-trove of valuable data. But how hard is it for your users to retrieve it? Salvage your data with a natural language interface - ask your app English questions, get clear answers and reports back.