SPAM AND PHISHING DETECTION IN VARIOUS LANGUAGES
By: Liana Ermakova

Abstract: Most existing spam filtering techniques suffer from serious disadvantages. Some produce many false positives. Others are suitable only for email filtering and cannot be used in instant messaging (IM) or social networks. Content-based methods therefore seem to be more effective. One of them is based on signature retrieval; however, it is not resistant to changes in the message. Enhancements exist (e.g. checksums), but they are extremely time- and resource-consuming. The main objective of this research is therefore to develop a method for detecting transformed messages. To this end we compared spam in four languages: English, French, Russian and Italian. For each language about 1000 messages, both spam and non-spam, were examined, and 135 quantitative features were extracted. Almost all of these features are language-independent. They underlie the first step of the algorithm, which is based on a support vector machine (SVM). The next stage tests the obtained results using a trigram approach. The proposed phishing detection technique is also based on an SVM and uses quantitative characteristics, message structure and keywords as features. The obtained results indicate the efficiency of the suggested approach.
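The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not list the 135 features or define the trigram test, so the five features below, the use of character (rather than word) trigrams, the Jaccard measure and all function names are assumptions made for illustration only. In a full pipeline the feature dictionary would be converted to a vector and passed to an SVM classifier, which is omitted here.

```python
import re
from collections import Counter


def quantitative_features(message: str) -> dict:
    """Compute a few language-independent counts (hypothetical feature set,
    not the 135 features used in the paper)."""
    n = max(len(message), 1)  # avoid division by zero on empty input
    return {
        "length": len(message),
        "digit_ratio": sum(c.isdigit() for c in message) / n,
        "upper_ratio": sum(c.isupper() for c in message) / n,
        "url_count": len(re.findall(r"https?://\S+", message)),
        "exclamation_count": message.count("!"),
    }


def char_trigrams(text: str) -> Counter:
    """Count overlapping character trigrams after lowercasing and
    collapsing runs of whitespace (assumed normalization)."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the trigram sets of two messages,
    from 0.0 (nothing shared) to 1.0 (identical sets)."""
    ta, tb = set(char_trigrams(a)), set(char_trigrams(b))
    if not ta and not tb:
        return 1.0  # two messages too short for any trigram count as equal
    return len(ta & tb) / len(ta | tb)
```

Under these assumptions, a lightly altered copy of a known spam message (for example "cheap p1lls online" against "cheap pills online") still shares most of its trigrams with the original, whereas an exact signature or checksum comparison would treat the two as different.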

Keywords: spam, corpus linguistics, phishing, filtering, text categorization.

ACM Classification Keywords: I.2.7 Text analysis

Link: http://www.foibg.com/ijitk/ijitk-vol04/ijitk04-3-p02.pdf
