
PHP Text Language Detection Library: Detect the language of a given text string

Last updated: 2016-12-20
Version: language-detection 1.3
License: Custom (specified...
PHP version: 7.1
Categories: Localization, Algorithms, Text proces..., A..., P...

language_detection - github.com

Description

This package can detect the language of a given text string.

It parses training text in many different languages into sequences of n-gram items and builds a database file in JSON format to be used in the detection phase.

The package can then take a given text and detect its language using the database previously generated in the training phase.

The package comes with text samples used for training and detecting text in 73 languages.


Details

language_detection


Detects the language of a given text. To do that, it generates a language profile based on N-grams for every file in the etc directory. It then generates the same kind of profile for the unknown text and compares it against the previously generated language profiles.
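The approach described above can be sketched roughly as follows. This is a minimal illustration of the general N-gram ranking technique (often called the out-of-place measure), not the package's actual code. The function names ngramProfile, profileDistance and guessLanguage, the trigram size, the penalty value and the tiny training strings are all assumptions made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Build a ranked N-gram profile: N-gram => rank (0 = most frequent).
 * Hypothetical helper, not part of the package's API.
 */
function ngramProfile(string $text, int $n = 3, int $maxItems = 300): array
{
    // Lowercase and pad with spaces so word boundaries become part of the N-grams.
    $text = ' ' . mb_strtolower(trim($text), 'UTF-8') . ' ';
    $length = mb_strlen($text, 'UTF-8');

    $counts = [];
    for ($i = 0; $i <= $length - $n; $i++) {
        $gram = mb_substr($text, $i, $n, 'UTF-8');
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }

    arsort($counts); // most frequent N-grams first
    $grams = array_slice(array_keys($counts), 0, $maxItems);

    return array_flip($grams); // N-gram => rank
}

/**
 * Out-of-place distance: sum of rank differences for shared N-grams,
 * plus a fixed penalty for each N-gram missing from the known profile.
 */
function profileDistance(array $unknown, array $known, int $penalty = 300): int
{
    $distance = 0;
    foreach ($unknown as $gram => $rank) {
        $distance += isset($known[$gram]) ? abs($rank - $known[$gram]) : $penalty;
    }
    return $distance;
}

/** Return the key of the profile with the smallest distance to the text. */
function guessLanguage(string $text, array $profiles): string
{
    $unknown = ngramProfile($text);
    $best = '';
    $bestDistance = PHP_INT_MAX;
    foreach ($profiles as $language => $profile) {
        $distance = profileDistance($unknown, $profile);
        if ($distance < $bestDistance) {
            $best = (string) $language;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Toy training data; real profiles need far more text per language.
$profiles = [
    'de' => ngramProfile('Dies ist ein kurzer deutscher Text, der als Beispiel für das Training dient.'),
    'en' => ngramProfile('This is a short English text that serves as an example for the training.'),
];

echo guessLanguage('Das ist ein deutscher Satz.', $profiles), PHP_EOL; // de
```

With such small training samples the result is only meaningful for clearly distinct languages; the larger the training text, the more stable the ranking becomes.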

Requirements:

The only requirement is a PHP version greater than or equal to 7.1.

Note: language_detection requires the Multibyte String extension in order to work.
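If you want to verify these requirements before installing, a small check along the following lines may help. The helper name missingRequirements is hypothetical and not shipped with the package; it only uses PHP's built-in version_compare and extension_loaded functions.

```php
<?php
declare(strict_types=1);

/**
 * Return a list of unmet requirements; an empty list means the environment looks fine.
 * Hypothetical helper for illustration, not part of the package.
 */
function missingRequirements(string $phpVersion, bool $hasMbstring): array
{
    $problems = [];
    if (version_compare($phpVersion, '7.1.0', '<')) {
        $problems[] = "PHP >= 7.1 is required, found $phpVersion";
    }
    if (!$hasMbstring) {
        $problems[] = 'The Multibyte String (mbstring) extension is not loaded';
    }
    return $problems;
}

// Check the running interpreter and report any problems on standard error.
foreach (missingRequirements(PHP_VERSION, extension_loaded('mbstring')) as $problem) {
    fwrite(STDERR, $problem . PHP_EOL);
}
```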

Install via Composer

composer require patrick-schur/language-detection

Or add the following to composer.json

{
  "require": {
     "patrick-schur/language-detection": "*"
  }
}

Basic Usage

Before we can recognize the language of a given text, we have to generate a language profile for each language. The package ships with a pre-trained language profile (etc/_langs.json). You can also add new files to etc or change existing ones.

First we have to generate a language profile.

// Load the Composer autoloader
require_once 'vendor/autoload.php';

use LanguageDetector\Trainer;

$t = new Trainer;

// Build the language profiles from the training files in etc
$t->learn();

Once we have our language profile, we can classify texts by their language. For reliable detection, the input text should be at least a few sentences long.

// Load the Composer autoloader
require_once 'vendor/autoload.php';

use LanguageDetector\LanguageDetector;

$ld = new LanguageDetector;

// Detect the language of a German sample sentence
var_dump($ld->detect('Das ist ein deutscher Satz.')); // de
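Because very short inputs are harder to classify, you may want to guard the call with a rough length check. The following is only a sketch: countSentences and detectIfLongEnough are hypothetical helpers, the two-sentence threshold is an arbitrary assumption, and the detector is passed in as a callable so the example runs without the package installed.

```php
<?php
declare(strict_types=1);

/** Rough sentence count: split on ., ! or ? followed by whitespace or end of text. */
function countSentences(string $text): int
{
    $parts = preg_split('/[.!?]+(?:\s+|$)/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? 0 : count($parts);
}

/**
 * Call the detector only when the text has at least $minSentences sentences;
 * otherwise return null so the caller can handle the uncertain case.
 */
function detectIfLongEnough(string $text, callable $detect, int $minSentences = 2)
{
    if (countSentences($text) < $minSentences) {
        return null;
    }
    return $detect($text);
}

// Stand-in detector for demonstration; with the package installed you could
// pass [$ld, 'detect'] instead, where $ld is a LanguageDetector instance.
$fakeDetector = function (string $text): string {
    return 'de';
};

var_dump(detectIfLongEnough('Das ist ein deutscher Satz.', $fakeDetector));
var_dump(detectIfLongEnough('Das ist ein Satz. Hier ist noch einer.', $fakeDetector));
```

The first call returns null because the input is a single sentence; the second passes the text on to the detector.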

Supported languages:

It currently supports 73 languages. If your language is not supported, feel free to add your own language files.

  • ab (abkhaz)
  • af (afrikaans)
  • am (amharic)
  • ar (arabic)
  • az (azerbaijani)
  • be (belarusian)
  • bg (bulgarian)
  • bn (bengali)
  • co (corsican)
  • cs (czech)
  • cy (welsh)
  • de (german)
  • dk (danish)
  • el (greek)
  • en (english)
  • eo (esperanto)
  • es (spanish)
  • et (estonian)
  • eu (basque)
  • fa (persian)
  • fi (finnish)
  • fj (fijian)
  • fo (faroese)
  • fr (french)
  • ga (irish)
  • gd (scottish)
  • gl (galician)
  • gn (guarani)
  • ha (hausa)
  • he (hebrew)
  • hi (hindi)
  • hr (croatian)
  • hu (hungarian)
  • hy (armenian)
  • ia (interlingua)
  • ig (igbo)
  • io (ido)
  • is (icelandic)
  • it (italian)
  • iu (inuktitut)
  • jp (japanese)
  • jv (javanese)
  • ka (georgian)
  • ko (korean)
  • ku (kurdish)
  • la (latin)
  • lg (ganda)
  • lo (lao)
  • lt (lithuanian)
  • lv (latvian)
  • mh (marshallese)
  • mn (mongolian)
  • ms (malay)
  • mt (maltese)
  • nl (dutch)
  • no (norwegian)
  • nv (navajo)
  • pl (polish)
  • pt (portuguese)
  • ro (romanian)
  • ru (russian)
  • sk (slovak)
  • sl (slovene)
  • so (somali)
  • sv (swedish)
  • th (thai)
  • tr (turkish)
  • ty (tahitian)
  • ug (uyghur)
  • uk (ukrainian)
  • uz (uzbek)
  • vi (vietnamese)
  • zh (chinese)

Files

  • etc (74 files)
  • src (1 directory)
  • tests (3 files)
  • .travis.yml (auxiliary data)
  • composer.json (auxiliary data)
  • LICENSE.md (license text)
  • phpunit.xml (auxiliary data)
  • README.md (documentation)
