Novell's Breakthrough Language Identifier Helps Lower Costs for Multilingual Businesses

01 Aug 1997

Companies that do business in more than one language can now automatically identify the language of foreign-language documents, lowering the cost of managing them. The Novell Collexion Language Identifier is the first commercial product that can identify the language of E-mail, word processing documents, Internet Web sites, and more. This product makes it easier for companies to quickly and efficiently connect people via their networks to the information they need to do their jobs.

Language Identifier is the fastest and most accurate engine of its kind, correctly identifying 15 different languages on the basis of as few as three words. It speeds the process of filtering information to appropriate people and enhances the productivity of users who compose text in multiple languages.

Software developers can license Language Identifier to incorporate into products such as Web browsers, e-mail applications, word processors, and others. A demonstration is available on the World Wide Web at

"In the age of the Internet, businesses are rapidly increasing their international communication," said Rudy Montigny, vice president of linguistic development for Novell. "Employees frequently spend valuable time determining the language origin of a message and routing it to someone who can read it. If they write in multiple languages, they frequently need to change the settings in their desktop applications. With Language Identifier embedded into applications, companies will save both time and money by automatically and efficiently labeling and sorting text messages for users."

Language Identifier employs a patent-pending language recognition technology that processes and identifies input seven times faster than the fastest competitors while using a seven-times-smaller data set. It requires no dictionaries or other large data files. The product has a language recall of virtually 100% for input of 20 words or more and is also extremely precise.

Language Identifier has many uses. In a word processing application, it can automatically select the linguistic tools (spell checking, grammar checking, thesaurus, etc.) for the user, saving time and effort. It can also streamline document management systems by labeling documents with the source language for easy classification and retrieval, as well as determining whether translation is necessary before showing documents to the user.
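The word processing and document management uses described above can be sketched as a simple lookup once a language label is available. This is a minimal illustration only, not Language Identifier's interface: the `PROOFING_TOOLS` registry and the functions `tools_for` and `needs_translation` are hypothetical names, and the language label is assumed to come from a separate identification step.

```python
# Hypothetical registry of the linguistic tools installed for each language.
PROOFING_TOOLS = {
    "English": ["spell checker", "grammar checker", "thesaurus"],
    "German": ["spell checker", "thesaurus"],
}


def tools_for(language):
    """Return the linguistic tools to activate for a document's language."""
    # An unrecognized language simply gets no tools.
    return PROOFING_TOOLS.get(language, [])


def needs_translation(document_language, reader_languages):
    """Return True when the reader cannot read the document's language."""
    return document_language not in reader_languages


# The label would be supplied by a language-identification step (assumed here).
label = "German"
active_tools = tools_for(label)
translate_first = needs_translation(label, {"English", "French"})
```

Keeping the label separate from the lookup means the same label can drive both tool selection and the translation decision.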

In Internet and E-mail applications, Language Identifier can rank messages, query hits and attached documents according to the user's language preferences. This offers a valuable filter for any Internet/intranet application, especially Internet search engines and information retrieval systems.
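Ranking by language preference can be illustrated with an ordinary stable sort. The sketch below assumes each message already carries a language label from an identification step. The `Message` class and `rank_by_language` function are hypothetical names chosen for the example and are not part of the product.

```python
from dataclasses import dataclass


@dataclass
class Message:
    subject: str
    language: str  # label assumed to come from a language-identification step


def rank_by_language(messages, preferences):
    """Order messages so languages listed earlier in `preferences` come first."""
    rank = {lang: i for i, lang in enumerate(preferences)}
    unlisted = len(preferences)  # languages not in the list sort last
    # sorted() is stable, so messages in the same language keep their order.
    return sorted(messages, key=lambda m: rank.get(m.language, unlisted))


inbox = [
    Message("Rapport trimestriel", "French"),
    Message("Quarterly report", "English"),
    Message("Kvartalsrapport", "Swedish"),
    Message("Meeting notes", "English"),
]
ranked = rank_by_language(inbox, ["English", "Swedish"])
```

Here the two English messages come first, followed by the Swedish one, with the French message last because French is not in the preference list.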

"People often encounter useful documents on the Internet or receive electronic correspondence in languages they don't recognize," commented Dr. Giovanni Tata, president of Provo, Utah-based Transoft International, a developer of language translation software. "Language Identifier dissolves linguistic barriers to make communicating nearly transparent: users can route documents for translation or action without delays and confusion. With Language Identifier, our customers can easily and automatically determine which of our language translation modules to use for a specific document, regardless of its source."

Language Identifier currently recognizes 15 languages: Danish, Dutch, English, Finnish, French, German, Greek, Indonesian, Italian, Norwegian, Portuguese, Russian, Spanish, Swedish and Turkish. Novell Collexion will add more languages in response to customer requests.

Language Identifier is available in Java for complete cross-platform functionality as well as C++ for Microsoft Windows 95, Microsoft Windows NT, Apple Macintosh and NetWare Loadable Module (NLM). It supports Ami Pro files, HTML files, plain text files (ANSI, ASCII and OEM #437), Quattro Pro files, RTF files, Microsoft Word files and WordPerfect files.

File types that contain no text, such as GIF images or fully graphical Web pages, are easily recognized and dismissed. Pricing information is available through Novell's OEM sales program.

Novell Collexion encompasses a suite of information retrieval tools, publishing tools, writing tools and linguistic components developed by Novell's Advanced Technology Division (ATD), the worldwide leader in linguistic technology and information retrieval development.

ATD linguists and software engineers develop language-enabled technologies to make computing attractive and easy to use for the average user. ATD's solutions are not only an integral part of most of Novell's products, but are now available to third-party developers for integration into their applications.

* Originally published in Novell AppNotes

