LauRa

Entries from May 2009

QUESTIONNAIRE 3: Machine Translation

May 31, 2009

Machine translation (MT) accuracy has recently increased, thanks to better techniques and to the availability of larger parallel training sets. Statistical MT systems are now able to translate across a wide variety of language pairs. This article covers the basic elements of state-of-the-art statistical MT, including modeling, decoding, evaluation, and data preparation.

There are many approaches to the machine translation of human languages. Some approaches require manual knowledge entry by highly skilled linguists, while others make use of automatic training procedures. Some approaches make use of abstract meaning representations, while others work at the level of word substitutions. Many combinations of these dimensions have been explored: manual entry of large dictionaries, automatic learning of phrase substitution tables, semi-automatic construction of syntactic-transformation rules, etc.
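To make the idea of a phrase substitution table more concrete, here is a minimal sketch in Python. It is only an illustration: the table entries, the probabilities and the function name `translate` are made up for this example, and a real statistical MT system would learn its table from parallel text and use a proper decoder with a language model instead of this greedy longest-match loop.

```python
# Hypothetical Spanish-to-English phrase table. Both the entries and the
# probabilities are invented for illustration; a real table is learned
# automatically from a parallel corpus.
PHRASE_TABLE = {
    ("la", "casa"): [("the house", 0.9)],
    ("casa",): [("house", 0.8), ("home", 0.2)],
    ("la",): [("the", 0.7), ("her", 0.3)],
    ("es",): [("is", 0.9)],
    ("muy", "grande"): [("very big", 0.8)],
    ("grande",): [("big", 0.6), ("large", 0.4)],
}


def translate(sentence, table=PHRASE_TABLE, max_len=3):
    """Greedy left-to-right translation using the longest matching phrase."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest source phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in table:
                # Pick the most probable target phrase for this source phrase.
                best, _ = max(table[key], key=lambda pair: pair[1])
                output.append(best)
                i += n
                break
        else:
            # Unknown word: copy it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


print(translate("La casa es muy grande"))  # the house is very big
```

The word-substitution approach corresponds to using only the one-word entries of the table; allowing longer phrases lets the table capture short idioms and local word order.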

We can find some machine translators on the Internet:

REFERENCES:

*Computational Linguistics and Machine Translation Research and Development. Retrieved 30 May 2009, 11:12 from http://www.fti.uab.es/tradumatica/revista/num4/articles/06/06central.htm

*Kevin Knight and Daniel Marcu. Retrieved 30 May 2009, 11:12 from http://www.isi.edu/natural-language/mt/icassp05.pdf

 


QUESTIONNAIRE 2: The Intelligent Library Assistant (DiLiA)

May 25, 2009

According to the Language Technology Lab, “the focus of the project is research and development for novel access to information in digital libraries. Digital libraries store information from diverse areas, e.g. scientific articles, and offer a web-based access to these digital formats. The user currently can search via names, standardized key words or other text-based means.”

In DiLiA, methods will be developed that extend this access along several dimensions. On the one hand, the user will be supported and guided by allowing her to visually investigate relevant search result sets. On the other hand, methods will be developed to extract cross links and semantic relations from a document set selected by the user.

We assume that a DiLiA user is searching for literature within a defined but not precisely specified area. Nor does she know exactly what the digital library covers or how it is indexed internally. To narrow down the set of possibly relevant documents (books, articles, other media), an interactive search based on dialogues between the user and the system will be carried out, opening up the contents of the documents.

The DiLiA project will realize the following innovative aspects that are combined in the given way for the first time:

  • Investigation, supply and adaptation of the contents of a real digital library for interactive information extraction
  • Hybrid information extraction based on a combination of metadata and document processing
  • Development of domain-adaptive deep methods for information extraction using the example of biomedical documents
  • Prototypical development of interactive personalized navigation allowing the user of the digital library an intuitive multimodal search.

REFERENCES:

*DFKILT (Language Technology Lab) DiLiA. Retrieved 25th May 2009, 13:15 from http://www.dfki.de/lt//project.php?id=Project_447&l=en


QUESTIONNAIRE 2: Automatic Language Identification (ALI)

May 25, 2009

Automatic Language Identification (ALI) is the problem of automatically identifying the language of an utterance through the use of a computer. In 1977, House and Neuburg proposed an approach to ALI which focused on the phonotactic constraints of different languages. Their work suggested that simple language models could be used effectively for language identification if an accurate phonetic representation of an utterance could be obtained from the acoustic signal. Some research centres are using House and Neuburg’s ideas as a starting point for a new segment-based approach to ALI. To develop a solid theoretical basis for the design of an ALI system, a formal probabilistic framework has been developed. This framework uses House and Neuburg’s ideas as its foundation but also utilizes additional information that may be useful for ALI. Specifically, phonotactic, acoustic and prosodic information are all incorporated into the framework, which provides the structure for the segment-based system.
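The phonotactic idea (each language allows some sound sequences more readily than others) can be illustrated with a small Python sketch. This is only an approximation under clear assumptions: since no phonetic recogniser is available here, written letters stand in for phones, the two training sentences are tiny made-up samples, and the function names are hypothetical. It is not the segment-based system described above, which also uses acoustic and prosodic information.

```python
import math
from collections import Counter


def train_bigram_model(text):
    """Count symbol bigrams; letters stand in for phones in this sketch."""
    symbols = list(text.lower())
    bigrams = Counter(zip(symbols, symbols[1:]))
    contexts = Counter(symbols[:-1])  # how often each symbol starts a bigram
    return bigrams, contexts, set(symbols)


def log_likelihood(text, model, vocab_size):
    """Log-probability of the text under an add-one smoothed bigram model."""
    bigrams, contexts, _ = model
    symbols = list(text.lower())
    total = 0.0
    for a, b in zip(symbols, symbols[1:]):
        total += math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
    return total


def identify(text, models):
    """Return the language whose model gives the text the highest score."""
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda lang: log_likelihood(text, models[lang], vocab_size))


# Tiny invented training samples; a real system needs far more data.
MODELS = {
    "english": train_bigram_model(
        "the fox jumps over the lazy dog and the cat sleeps in the house"
    ),
    "spanish": train_bigram_model(
        "el zorro salta sobre el perro perezoso y el gato duerme en la casa"
    ),
}

print(identify("the lazy dog", MODELS))       # english
print(identify("el perro perezoso", MODELS))  # spanish
```

Each language gets its own simple sequence model, and the utterance is assigned to the language whose model finds its symbol sequence most probable, which is the core of the language-model approach suggested by House and Neuburg.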

EXAMPLES:

  • A bilingual voyager system
  • Language identification using noisy speech
  • Language recognition test and evaluation
  • Automatic language identification
  • An automatic language identification

REFERENCES:

*Citeseer. Retrieved 21st May 2009, 16:53 from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.2948

*Language translation. Retrieved 21st May 2009, 16:53 from http://www.translation-guide.com/language_identification.htm


QUESTIONNAIRE 2: List of topics

May 25, 2009

These are the 12 topics I have chosen:

  1. Multilingual Tourist Information on the World Wide Web (MIETTA)
  2. Intelligent Extraction of Information from On-line documents (PARADIME)
  3. The Intelligent Library Assistant (DiLiA)
  4. Controlled Semantic-based Question Answering (ConQA)
  5. Collaborating Using Diagrams (Study of how pairs collaborate when planning a route on a map)
  6. Machine Learning for Named Entity Recognition (SEER)
  7. Answer Extraction
  8. Lexical-Functional Grammar
  9. Natural logic and interface
  10. Citizens Advanced Relationship Management
  11. Text-to-speech Synthesis
  12. Automatic Language Identification

RESOURCES:

*Projects. In Language Technology World. Retrieved 21st May 2009, 16:22 from http://www.lt-world.org/

*Language Technology Lab, current projects. Retrieved 21st May 2009, 16:24 from http://www.dfki.de/lt/current_projects.php

*Language Technology Lab, completed projects. Retrieved 21st May 2009, 16:25 from http://www.dfki.de/lt/completed_projects.php


QUESTIONNAIRE 1: Human Language Technologies (Research centres)

May 10, 2009

“Human Language Technology (HLT) makes it easier for people to interact with machines. This can benefit a wide range of people – from illiterate farmers in remote villages who want to obtain relevant medical information over a cellphone, to scientists in state-of-the-art laboratories who want to focus on problem-solving with computers.”

Human Language Technology covers several different areas:

Multimodal Interaction
Technologies to deal with a recent paradigm shift in the design of Pattern Recognition, where the traditional concept of full automation is being replaced by systems in which the decision process is conditioned by human feedback. Problems and applications considered within this area include: Relevance-based (image) information retrieval and Interactive-Predictive processing for Computer Assisted Machine Translation, as well as for the Interactive Transcription of speech audio streams and text images.
Machine Translation
Speech-to-speech translation or text-to-text translation for limited domains. Finite-state and statistical transducers are used as the basis of the machine translation systems. These models can be learnt automatically from real examples of translation. Applications: translation of technical reports, hotel services, etc.
Handwritten Cursive Text Recognition (HTR)
Both off-line (document images) and on-line HTR (tablet or e-pen signals) are considered. No prior character or word segmentation is needed. Technology, borrowed from Speech Recognition, relies on character Hidden Markov Models, Finite State word models, and syntactic N-Grams. After model training, for each given text line image, a holistic (“Viterbi”) search provides both an optimal transcription and the corresponding word and character segmentations. Applications: Transcription of ancient and legacy documents, transcription of unconstrained handwritten text in survey forms, etc.
Automatic Speech Recognition and Understanding
The speech utterances are decoded into strings of words or into strings of semantic units. Finite-state grammars are used as the basis of such systems. These finite-state grammars are learnt automatically from real examples of utterances or text. Applications: telephone exchange services, device control by voice, information queries, etc.
Image Analysis and Computer Vision
Identification of the objects in an image. Statistical and Syntactic Pattern recognition techniques are used. Applications: OCR and document analysis, medical diagnosis, fingerprint identification, classification of chromosomes, aids for the handicapped, manufacturing quality control, etc.
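The “Viterbi” search mentioned under Handwritten Cursive Text Recognition can be sketched in a few lines of Python. This is a generic textbook version of the algorithm, not the research group's implementation. The two hidden states (the characters "c" and "e"), the three observation symbols (quantised stroke shapes) and all the probabilities are hypothetical values chosen only to show how the most likely state sequence is recovered.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, prob


# Hypothetical toy model: hidden characters and observed stroke shapes.
STATES = ["c", "e"]
START_P = {"c": 0.6, "e": 0.4}
TRANS_P = {"c": {"c": 0.7, "e": 0.3}, "e": {"c": 0.4, "e": 0.6}}
EMIT_P = {
    "c": {"arc": 0.5, "stroke": 0.4, "loop": 0.1},
    "e": {"arc": 0.1, "stroke": 0.3, "loop": 0.6},
}

path, prob = viterbi(["arc", "stroke", "loop"], STATES, START_P, TRANS_P, EMIT_P)
print(path)  # ['c', 'c', 'e']
```

Because the search keeps, for every state and time step, only the best path that ends there, it yields the transcription and the segmentation at the same time, which is why no prior character or word segmentation is needed. Real systems work with log-probabilities to avoid numerical underflow on long sequences.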

REFERENCES:

*Meraka Institute. Retrieved 25th May 2009, 12:43 from: http://www.meraka.org.za/humanLanguage.htm

*Pattern recognition and Human Language Technologies. Retrieved 25th May 2009, 12:44 from: http://prhlt.iti.es/content.php?page=areas.php


QUESTIONNAIRE 1: Human Language Technologies

May 10, 2009

According to Wikipedia, language technology is often called human language technology (HLT) or natural language processing (NLP). It consists of computational linguistics (CL) and speech technology at its core, but also includes many application-oriented aspects of them. Language technology is closely connected to computer science and general linguistics.

Human language technologies promise both to connect people to information and to let them interact with one another, for example by increasing their awareness of knowledge artifacts or activities that intersect with their interests. Key elements include cataloguing existing knowledge, discovering expertise, and creating new knowledge.

REFERENCES:

*Human Language Technology. Wikipedia, The Free Encyclopedia. Retrieved 10 May 2009 12:49 from http://en.wikipedia.org/w/index.php?title=Language_technology&oldid=202607020

*Survey of the State of the Art in Human Language Technology (1997). In German Research Center for Artificial Intelligence. Retrieved 10 May 2009, 12:50 from http://www.dfki.de/~hansu/HLT-Survey.pdf
