193 Commits

Author SHA1 Message Date
Leonard Richardson
91836cd806 Added a script to update corpora. 2018-07-24 23:35:35 -04:00
Leonard Richardson
0466ff68fd Merge branch 'setupify' of https://github.com/leonardr/olipy into setupify 2018-07-24 17:09:01 -04:00
Leonard Richardson
c49b54b3ca Merge pull request #2 from samplereality/master: Updating to match new textblob package name 2018-07-24 17:08:29 -04:00
Leonard Richardson
89d57fc798 Moved texts to words/literature. 2018-05-06 18:14:01 -04:00
Leonard Richardson
f183fd6e8e Converted board games and Apollo 11 to the corpora format. 2018-05-06 14:11:55 -04:00
Leonard Richardson
7df7a9297e Consolidated the word lists where possible and put them into the same format used by corpora. 2018-05-06 13:17:08 -04:00
Leonard Richardson
61e771dac5 concrete_nouns and nouns_by_frequency are redundant. 2018-05-06 12:35:58 -04:00
Leonard Richardson
a029d0f116 Moving corpora-more into directories to match corpora-original. 2018-05-06 12:35:30 -04:00
Leonard Richardson
d7abe1cb03 renamed data to corpora-original in the loader code. 2018-05-06 09:47:37 -04:00
Leonard Richardson
324ba0d0b2 Brought corpora-original up to date with https://github.com/dariusk/corpora revno d6f705c150ba16a8848284259a9eeda5868b8cf3 2018-05-06 09:37:15 -04:00
Leonard Richardson
bdbaa0e84a Move more-corpora to corpora-more so it will sort near corpora-original 2018-05-06 09:28:54 -04:00
Leonard Richardson
2637c5ddff First stab at turning this into something that can go on PyPI. 2018-05-04 17:08:10 -04:00
Leonard Richardson
8744120d27 Merge branch 'master' of https://github.com/leonardr/olipy 2017-08-27 09:25:01 -04:00
Leonard Richardson
f7f5fc8bb5 Fix a bug that prevented the entire beginning from being yielded when the order was greater than one. 2017-08-27 09:24:36 -04:00
Leonard Richardson
893bf3cae7 Pad to an integer size. 2017-07-30 15:06:16 -04:00
Leonard Richardson
4a8f9503bf Got gibberish.py to pass a stress test. 2017-07-30 14:58:41 -04:00
Leonard Richardson
fd90004dab Trying to clean up the README. 2017-06-25 09:20:32 -04:00
Leonard Richardson
d53fd6f740 Added list for content filter. 2017-06-24 14:54:15 -04:00
Leonard Richardson
2e7321b882 Moved old datasets into more-corpora and got all the example scripts to work. 2017-06-24 12:49:39 -04:00
Leonard Richardson
0975085b7d All work and no play makes Jack a dull boy. 2017-06-24 12:00:39 -04:00
Leonard Richardson
1b6728f639 Added a helper method to pad a string with random whitespace characters. 2017-06-24 11:48:30 -04:00
Leonard Richardson
6d8b30ed23 Added example script for corpora. 2017-06-24 11:41:46 -04:00
Leonard Richardson
4250090ba1 Got more-corpora into shape. 2017-06-24 11:31:50 -04:00
Leonard Richardson
216a87b14f Added an incomplete mapping of Unicode glyphs to other glyphs that visually resemble them. 2017-02-18 08:37:59 -05:00
Leonard Richardson
ab47b6d3c9 remove set_trace call. 2016-05-12 01:48:33 +00:00
Leonard Richardson
ca25a9f5be Attempt to use COMBINING GRAPHEME JOINER to preserve whitespace in art. 2016-05-11 21:45:04 -04:00
Leonard Richardson
bf7c822ded Added a custom alphabet for skin tones. 2016-05-11 21:37:42 -04:00
Leonard Richardson
f2a74a0e51 Added the sampler gibberish. 2016-03-27 14:41:30 -04:00
Leonard Richardson
61e7e205af Added ModifierGradientGibberish, where the alphabet is held constant but the modifier in use shifts from one to the other. 2016-03-27 09:23:18 -04:00
Leonard Richardson
0aff58ce48 The SingleModifierGibberish is now the LimitedModifierGibberish and can have a choice of more than one modifier to use. 2016-03-27 08:57:39 -04:00
Leonard Richardson
f6dad9a2bf Added different types of whitespace to limited-vocabulary alphabets. Added mirrored mosaics from non-tilable alphabets. Added mosaics that are primarily whitespace. 2016-03-27 08:52:00 -04:00
Leonard Richardson
d1c330d7a2 Fixed merge conflict. 2015-06-21 08:56:45 -04:00
Leonard Richardson
a03b03afad Integrated mirrored mosaics into the Smooth Unicode workflow. 2015-06-21 08:55:27 -04:00
Leonard Richardson
f764754bec Refactored the mosaic code. 2015-06-21 07:43:42 -04:00
Leonard Richardson
d139178141 Refactored the mosaic code. 2015-06-21 07:41:15 -04:00
Leonard Richardson
d9e836ee95 Got the mosaic code semi-working again. 2015-06-11 06:57:26 -04:00
Leonard Richardson
492a747557 Started work on a module for generating symmetrical tile patterns. 2015-06-11 06:31:33 -04:00
Mark Sample
dddfdd1b18 Update ebooks.py: Fixed textblob import to match new textblob packagename 2015-02-13 21:57:29 -05:00
Mark Sample
a181371dcf Update tokenizer.py: Changed text.base to textblob.base to match new package name of textblob 2015-02-13 21:56:34 -05:00
Leonard Richardson
532f500e04 Don't put the last item in the middle bucket. 2014-10-01 18:09:15 -04:00
Leonard Richardson
351a78197b Fixed a bug in the Markov generator that stopped the example from working. 2014-09-22 20:31:33 -04:00
Leonard Richardson
51e9fa89b1 Added gibberish in which every character is modified with the same modifier. 2014-04-06 14:47:49 -04:00
Leonard Richardson
f27de48210 Merge branch 'master' of https://github.com/leonardr/olipy 2014-04-06 12:01:52 -04:00
Leonard Richardson
977f3b00af Added emoji support to gibberish. Fix rare ebooks glitch that yielded the same quote twice on small corpora. 2014-04-06 12:01:40 -04:00
Leonard Richardson
45b0eeef34 Moved example to docstring. 2014-01-21 19:39:47 -08:00
Leonard Richardson
f4a559f200 Made public the tricks I've developed to measure the quality of _ebooks quotes. 2013-12-30 20:38:35 -05:00
Leonard Richardson
016ef517fa Added a better tokenizer. 2013-12-30 15:32:08 -05:00
Leonard Richardson
6428d6fb5a Fixed typo. 2013-12-29 12:15:06 -05:00
Leonard Richardson
cdc20249ac Moved the list of bad words to word-lists. 2013-12-29 12:14:49 -05:00
Leonard Richardson
6ac9136f6f Filtering is not good enough to promise anything other than 'adjectives' and 'nouns'. 2013-12-29 12:13:20 -05:00