# trigram-utils

[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]

Trigram language statistics utility functions, in their own repository to make
sure [`trigrams`][trigrams] (trigram info for the Universal Declaration of
Human Rights) and [`franc`][franc] (language detection) use the same cleaning
and classification methods.

## Installation

[npm][]:

```bash
npm install trigram-utils
```

## Usage

```js
var utils = require('trigram-utils')

utils.clean(' t@rololol ') // => 't rololol'

utils.trigrams(' t@rololol ')
// => [ ' t ', 't r', ' ro', 'rol', 'olo', 'lol', 'olo', 'lol', 'ol ' ]

utils.asDictionary(' t@rololol ')
// => { 'ol ': 1, lol: 2, olo: 2, rol: 1, ' ro': 1, 't r': 1, ' t ': 1 }

var tuples = utils.asTuples(' t@rololol ')
// => [ [ 'ol ', 1 ],
//     [ 'rol', 1 ],
//     [ ' ro', 1 ],
//     [ 't r', 1 ],
//     [ ' t ', 1 ],
//     [ 'lol', 2 ],
//     [ 'olo', 2 ] ]

utils.tuplesAsDictionary(tuples)
// => { olo: 2, lol: 2, ' t ': 1, 't r': 1, ' ro': 1, rol: 1, 'ol ': 1 }
```

## API

### `utils.clean(value)`

Clean a given string: strips some punctuation, symbols, and numbers that are
useless for language detection, then collapses white space, trims, and
lowercases the result.
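
As a rough illustration of these steps, the sketch below reimplements the
cleaning in plain JavaScript.  The name `cleanSketch` and the character range
(ASCII `!` through `@`, which includes the digits and much of ASCII punctuation)
are assumptions made for this example, not necessarily what the package does
internally.  Replacing the stripped characters with a space is inferred from the
usage example, where `@` ends up as a space.

```js
// Hypothetical stand-in for `utils.clean`; the regular expression is an assumption.
function cleanSketch(value) {
  if (value === null || value === undefined) {
    return ''
  }

  return String(value)
    .replace(/[\u0021-\u0040]+/g, ' ') // `!` through `@` (incl. digits) -> one space
    .replace(/\s+/g, ' ') // collapse runs of white space
    .trim()
    .toLowerCase()
}

console.log(cleanSketch(' t@rololol ')) // => 't rololol'
```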

### `utils.trigrams(value)`

Get clean, padded trigrams (see [`n-gram`][n-gram]).
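
To make "padded" concrete, here is a minimal sketch that assumes padding means
one space added on each side before a three-character window slides over the
string, which is what the usage example suggests.  `paddedTrigrams` is a
hypothetical helper, not part of the package, and it expects an already cleaned
string.

```js
// Hypothetical helper: expects an already cleaned string.
function paddedTrigrams(cleaned) {
  var padded = ' ' + cleaned + ' ' // one space of padding on each side
  var result = []
  var index

  // Slide a window of three characters over the padded string.
  for (index = 0; index + 3 <= padded.length; index++) {
    result.push(padded.slice(index, index + 3))
  }

  return result
}

console.log(paddedTrigrams('t rololol'))
// => [ ' t ', 't r', ' ro', 'rol', 'olo', 'lol', 'olo', 'lol', 'ol ' ]
```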

### `utils.asDictionary(value)`

Get clean trigrams as a dictionary: keys are trigrams, values are occurrence
counts.
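
The counting step can be sketched as follows.  `countTrigrams` is a hypothetical
helper that takes a list of trigrams rather than a raw string, so it only
illustrates the shape of the result, not the package's internals.

```js
// Hypothetical helper: count how often each trigram occurs in a list.
function countTrigrams(trigrams) {
  var dictionary = {}

  trigrams.forEach(function (trigram) {
    // Start at 0 for a trigram seen for the first time.
    dictionary[trigram] = (dictionary[trigram] || 0) + 1
  })

  return dictionary
}

var counts = countTrigrams([' t ', 't r', ' ro', 'rol', 'olo', 'lol', 'olo', 'lol', 'ol '])

console.log(counts.olo) // => 2
console.log(counts['ol ']) // => 1
```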

### `utils.asTuples(value)`

Get clean trigrams with occurrence counts as a list of tuples: in each tuple,
the first index (`0`) is the trigram and the second (`1`) is the occurrence
count.
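
A minimal sketch of the tuple shape, assuming the tuples are ordered by ascending
count as they appear in the usage example.  `dictionaryToTuples` is a
hypothetical helper that starts from a trigram dictionary, and the order among
tuples with equal counts is not meant to match the package.

```js
// Hypothetical helper: turn a trigram dictionary into [trigram, count] tuples.
function dictionaryToTuples(dictionary) {
  return Object.keys(dictionary)
    .map(function (trigram) {
      return [trigram, dictionary[trigram]]
    })
    .sort(function (a, b) {
      return a[1] - b[1] // ascending by count (assumed from the usage example)
    })
}

var sketchTuples = dictionaryToTuples({olo: 2, ' t ': 1, lol: 2, 'ol ': 1})

console.log(sketchTuples[0][1]) // => 1
console.log(sketchTuples[3][1]) // => 2
```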

### `utils.tuplesAsDictionary(tuples)`

Transform an `Array` of trigram–occurrence tuples (as returned by
[`asTuples()`][as-tuples]) into a dictionary (as returned by
[`asDictionary()`][as-dictionary]).
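
The transformation amounts to using each tuple's first item as a key and its
second as the value.  A small sketch, with `tuplesToDictionary` as a
hypothetical name chosen for this example:

```js
// Hypothetical helper: index `0` becomes the key, index `1` the value.
function tuplesToDictionary(tuples) {
  var dictionary = {}

  tuples.forEach(function (tuple) {
    dictionary[tuple[0]] = tuple[1]
  })

  return dictionary
}

var sketchDictionary = tuplesToDictionary([['ol ', 1], ['lol', 2], ['olo', 2]])

console.log(sketchDictionary.lol) // => 2
```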

## License

[MIT][license] © [Titus Wormer][author]

<!-- Definitions -->

[build-badge]: https://img.shields.io/travis/wooorm/trigram-utils.svg

[build]: https://travis-ci.org/wooorm/trigram-utils

[coverage-badge]: https://img.shields.io/codecov/c/github/wooorm/trigram-utils.svg

[coverage]: https://codecov.io/github/wooorm/trigram-utils

[downloads-badge]: https://img.shields.io/npm/dm/trigram-utils.svg

[downloads]: https://www.npmjs.com/package/trigram-utils

[size-badge]: https://img.shields.io/bundlephobia/minzip/trigram-utils.svg

[size]: https://bundlephobia.com/result?p=trigram-utils

[npm]: https://docs.npmjs.com/cli/install

[license]: license

[author]: https://wooorm.com

[trigrams]: https://github.com/wooorm/trigrams

[franc]: https://github.com/wooorm/franc

[n-gram]: https://github.com/words/n-gram

[as-tuples]: #utilsastuplesvalue

[as-dictionary]: #utilsasdictionaryvalue