<OAI-PMH schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2018-01-15T15:42:54Z</responseDate>
  <request identifier="oai:HAL:hal-00329991v1" verb="GetRecord" metadataPrefix="oai_dc">http://api.archives-ouvertes.fr/oai/hal/</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:HAL:hal-00329991v1</identifier>
        <datestamp>2018-01-11</datestamp>
        <setSpec>type:ART</setSpec>
        <setSpec>subject:info</setSpec>
        <setSpec>collection:CNRS</setSpec>
        <setSpec>collection:UNIV-PARIS7</setSpec>
        <setSpec>collection:LIX</setSpec>
        <setSpec>collection:INALCO</setSpec>
        <setSpec>collection:X-LIX</setSpec>
        <setSpec>collection:X-DEP-INFO</setSpec>
        <setSpec>collection:X-DEP</setSpec>
        <setSpec>collection:X</setSpec>
        <setSpec>collection:PARISTECH</setSpec>
        <setSpec>collection:BNRMI</setSpec>
        <setSpec>collection:UNIV-AG</setSpec>
        <setSpec>collection:CEREGMIA</setSpec>
        <setSpec>collection:USPC</setSpec>
      </header>
      <metadata>
        <dc>
          <publisher>HAL CCSD</publisher>
          <title lang="en">Soft memberships for spectral clustering, with application to permeable language distinction</title>
          <creator>Nock, Richard</creator>
          <creator>Vaillant, Pascal</creator>
          <creator>Henry, Claudia</creator>
          <creator>Nielsen, Frank</creator>
          <contributor>Centre de Recherche en Economie, Gestion, Modélisation et Informatique Appliquée (CEREGMIA) ; Université des Antilles et de la Guyane (UAG)</contributor>
          <contributor>Centre d'Études des Langues Indigènes d'Amérique (CELIA) ; Institut National des Langues et Civilisations Orientales (Inalco) - Université Paris Diderot - Paris 7 (UPD7) - Centre National de la Recherche Scientifique (CNRS)</contributor>
          <contributor>Groupe de Recherche en Informatique et Mathématiques Appliquées Antilles-Guyane (GRIMAAG) ; Université des Antilles et de la Guyane (UAG)</contributor>
          <contributor>Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX) ; Centre National de la Recherche Scientifique (CNRS) - Polytechnique - X</contributor>
          <description>International audience</description>
          <source>ISSN: 0031-3203</source>
          <source>Pattern Recognition</source>
          <publisher>Elsevier</publisher>
          <identifier>hal-00329991</identifier>
          <identifier>https://hal.archives-ouvertes.fr/hal-00329991</identifier>
          <source>https://hal.archives-ouvertes.fr/hal-00329991</source>
          <source>Pattern Recognition, Elsevier, 2009, 42 (1), pp.43-53. 〈10.1016/j.patcog.2008.06.024〉</source>
          <identifier>DOI : 10.1016/j.patcog.2008.06.024</identifier>
          <relation>info:eu-repo/semantics/altIdentifier/doi/10.1016/j.patcog.2008.06.024</relation>
          <language>en</language>
          <subject lang="en">Spectral clustering</subject>
          <subject lang="en">Soft membership</subject>
          <subject lang="en">Stochastic processes</subject>
          <subject lang="en">Text classification</subject>
          <subject>ACM H.3.3</subject>
          <subject>[INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]</subject>
          <type>info:eu-repo/semantics/article</type>
          <type>Journal articles</type>
          <description lang="en">Recently, a large amount of work has been devoted to the study of spectral clustering — a powerful unsupervised classification method. This paper brings contributions to both its foundations, and its applications to text classification. Departing from the mainstream, concerned with hard membership, we study the extension of spectral clustering to soft membership (probabilistic, EM style) assignments. One of its key features is to avoid the complexity gap of hard membership. We apply this theory to a challenging problem, text clustering for languages having permeable borders, via a novel construction of Markov chains from corpora. Experiments with a readily available code clearly display the potential of the method, which brings a visually appealing soft distinction of languages that may define altogether a whole corpus.</description>
          <date>2009-01</date>
          <contributor>ANR-07-BLAN-0328-1, ANR-07-BLAN-0328-1</contributor>
        </dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>