Simplest Possible Product #3

Open
MichaelPaulukonis opened this issue Apr 18, 2016 · 6 comments

@MichaelPaulukonis

I don't have any big ideas this year, or at least none that I have any chance of completing in the midst of everything else I'm doing.

So -- I plan to connect a "poem" generator to a Tumblr-bot.

The generator will likely be a minor elaboration on what I did last year -- intake, processing, output.
The wire-up will give me daily results to look at and tweak, as an on-going goad.

So: no giant contributions to the art.

As usual: incremental advances.

@MichaelPaulukonis changed the title from "Participation" to "Simplest Possible Product" on Apr 20, 2016
@MichaelPaulukonis

MichaelPaulukonis commented Apr 20, 2016

So, I've been looking at the Lexeduct code that Chris Pressey started last year. I didn't look into it enough at the time, and my work with it ran at cross-purposes to its ideology (my work last year was in the gh-pages branch, and "worked", even though it doesn't fit the main model in the master branch).

I think wrangling that understanding and applying it to what I currently want to do will be too time-consuming (although profitable).

So, I'm going to do the Simplest Thing That Could Possibly Work.

  1. static text generator posts to Tumblr
  2. text generator becomes non-static
  3. elaborate and iterate on step 2
  4. end-goal includes ingestion of source material from online news

@MichaelPaulukonis

MichaelPaulukonis commented Apr 21, 2016

http://poeticalbot.tumblr.com/

  • static text generator posts to Tumblr
  • text generator becomes non-static
  • elaborate and iterate on step 2
  • end-goal includes ingestion of source material from online news

https://github.com/MichaelPaulukonis/NaPoGenMo2016

@MichaelPaulukonis

MichaelPaulukonis commented Apr 26, 2016

Initial non-static implementation is a headless version of Edde Addad's jGnoetry using my already derived work.

Also, it's running live on Heroku with a once-a-day scheduler.

So without further intervention -- we'll have a poem-a-day.

Further elaborations will be:

  • Source texts
    • Howl
    • Wizard of Oz
    • Apocalypse Now Redux (script)
  • templates
    • Howl
    • MSDN page fragment
  • semi randomized options (certainly corpus percentages)
    • corpus percentages
    • other options
  • title generator
    • template name + corpus percentages
    • 1,4..10 of most common words > 3 chars ALL CAPS

@MichaelPaulukonis

MichaelPaulukonis commented Apr 27, 2016

Some examples:

quatrain 0 1 98 1

The filmed culture teaches as reporting
On average annual increase the decisions
They would otherwise ensure. These students
To use of the challenge in america.

howl 29 39 26 6

Cut out of the ills

Infringements

A bag find those holy want is true my perspective, even though not something sweet hue is. Love groan

It easier article had mastered of thousands just cut out the

Gentle original a bag thought thee, increasingly copy as and there you. Shake gently manipulate Pavements,, even if

I schools and the United a dream the order in the web-log, beauty’s the sun and put them to mine. Next my article audit the bag. Copy conscientiously in a canker streetlight

In chapter a newspaper. Then one particularly seem like endless there is his triumphant prize. And, even thought where you’re order flowers

Saw that makes a newspaper. The order which cannot just and to make of a world where is up, even though unappreciated by the breath

That coughs o. Take a kid. If thou. Take and honor make your self alone sensibility, the worst one after the [etc. etc. etc.]


There's something weird with the opening of the Howl template; have to look at that some more.

Some of the time, the corpus weights should be able to select just one text.

The title is the template name, plus the corpus weights. Since nobody knows what those are, it's pretty much magic numbers, but that's fine by me for a naive algorithm (I'm reminded of how Edde Addad would use GUIDs as names. Unwieldy, but... unique.).

Looking at taking initial words, highest frequency words, other options.

@MichaelPaulukonis

MichaelPaulukonis commented Apr 28, 2016

I think the Howl template is... too big? It's fun, but it's a blast of words, and goes on too long.

Last night I added a new template (derived from the start of an MSDN page, no less), two new source texts (The Wizard of Oz and a script of Apocalypse Now), a new title algorithm, and some tweaks. The script may be odd. OTOH, that may lend some interesting frisson. Maybe if there were more scripts?

And again -- all of this is markovian. Which has its pleasures, but also wears thin and obvious rather quickly.


CALCULATED CORRECTLY THAT

He calculated correctly that helped
Amplify text. Today you have developed.


I want to get some more title algorithms and switch them up, and get more of the parameters randomized. They won't be big changes -- but will affect capitalization and punctuation a bit. I'd also like the ability to pick just one or two texts from the corpus; the current algorithm can randomly assign some to 0%, but I'd like most of them to be 0% on occasion.
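The corpus selection described above could be sketched roughly like this in Python. This is not the bot's actual code: the function name, the `solo_chance` parameter, and the corpus names are all made up for illustration. The one thing taken from the thread is that the weights read as percentages (the sample titles "quatrain 0 1 98 1" and "howl 29 39 26 6" each add up to 100).

```python
import random


def random_corpus_weights(corpus_names, solo_chance=0.2, rng=random):
    """Return {corpus name: integer percentage}, always summing to 100.

    solo_chance is a hypothetical knob: the probability that a single
    corpus gets 100% and every other corpus gets 0%.
    """
    names = list(corpus_names)
    if not names:
        raise ValueError("need at least one corpus")

    # Occasionally pick exactly one text and zero out the rest.
    if rng.random() < solo_chance:
        chosen = rng.choice(names)
        return {name: (100 if name == chosen else 0) for name in names}

    # Otherwise split 0..100 at random cut points. Two cuts can land on
    # the same value, so some corpora may still end up at 0%.
    cuts = sorted(rng.randint(0, 100) for _ in range(len(names) - 1))
    bounds = [0] + cuts + [100]
    return {name: bounds[i + 1] - bounds[i] for i, name in enumerate(names)}


if __name__ == "__main__":
    # Corpus names here are placeholders, not the repo's file names.
    print(random_corpus_weights(["howl", "oz", "apocalypse-now", "news"]))
```

Raising `solo_chance` makes the "most of them at 0%" case more frequent without touching the normal split.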

The title algorithm tracks the most common words in the text and selects some of them (randomly, but weighted for 4..10 if that many words are present). It was biased towards common words like "and", "of", "or", "the", "is", "an", etc. -- so I had it ignore words < 4 characters. Naive (since "this, that, those" slip through), but quickly workable. There's a big fat library that can pick out topic words, but at the moment it seems a large dependency. OTOH, there's a lot of room in Heroku to add in libs....
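A minimal sketch of that kind of title picker, in Python, under some guesses: the tokenizing regex, the 80% bias towards 4..10 words, and the `count` override are assumptions for illustration, not details from the repo. Only the frequency count, the "ignore words < 4 characters" filter, and the ALL CAPS output come from the description above.

```python
import random
import re
from collections import Counter


def make_title(text, min_len=4, count=None, rng=random):
    """Build an ALL CAPS title from the most frequent longer words.

    count (hypothetical) fixes how many words to use; when it is None
    the number is chosen at random, biased towards 4..10.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # Drop short words: a crude stand-in for a stop-word list.
    counts = Counter(w for w in words if len(w) >= min_len)
    # most_common() keeps first-seen order among equally frequent words.
    ranked = [word for word, _ in counts.most_common()]
    if not ranked:
        return ""

    if count is None:
        if len(ranked) >= 4 and rng.random() < 0.8:  # assumed bias
            count = rng.randint(4, min(10, len(ranked)))
        else:
            count = rng.randint(1, min(3, len(ranked)))
    return " ".join(ranked[:count]).upper()


if __name__ == "__main__":
    poem = ("He calculated correctly that helped\n"
            "Amplify text. Today you have developed.")
    print(make_title(poem, count=3))
```

With `count=3` on the sample poem from this comment, every qualifying word occurs once, so the first three in reading order are used. As noted, "that" still slips through the length filter.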

I've turned the bot on to fire hourly during development, but it will go back to once-a-day when the dust settles.

I want some other poetry generation algorithms, and really want to tweak the template generation and processing -- would like to do some fill-in-the-blanks. That is, have words present in the template that are spit back out. Currently, any non-template token in the template is treated as a reason to templatize the input text.
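The fill-in-the-blanks idea could look something like the Python sketch below. The `_` slot marker and the `next_word` callback are invented for illustration; this is not jGnoetry's template syntax or its API. The point is only the split: slot tokens get generated words, and every other token in the template is spit back out unchanged.

```python
SLOT = "_"  # hypothetical slot marker, not jGnoetry's real template syntax


def fill_template(template, next_word):
    """Fill each SLOT with a generated word; keep other tokens literally.

    next_word is any callable that takes the previous output word
    (or None at the start) and returns the next word, e.g. a step of a
    Markov chain built from the source texts.
    """
    out = []
    for token in template.split():
        if token == SLOT:
            previous = out[-1] if out else None
            out.append(next_word(previous))
        else:
            out.append(token)  # literal template word passes through
    return " ".join(out)


if __name__ == "__main__":
    # A trivial stand-in generator; a real one would consult the corpus.
    print(fill_template("I saw the best _ of my _", lambda prev: "words"))
```

Because the generator sees the previous output word, literal template words can still steer which words follow them.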

@MichaelPaulukonis

Just to be clear, the code is auto-posting once an hour to http://poeticalbot.tumblr.com/
