Twitter bots, twitter bots, twitter bots, folks

After the incredible success of @SomeHonMembers, I decided to create @HonSpeakerBot, which was not nearly as popular, but whatever.

The lack of any transcripts or any data made a bot for #TOpoli difficult. But now, there are two. You may have seen them around as I was testing them.

Mayor Robot Ford hates streetcars

This idea is based on a popular Japanese twitter bot by the name of @akari_daisuki. What Akari does is take a random Wikipedia article title or something from her timeline. Let’s call this thing $x$. Every fifteen minutes, she tweets the Japanese equivalent of “Yay $x$, Akari loves $x$”.

The idea behind @MayorRobotFord is similar. He takes a random Wikipedia article title or something from the #TOpoli stream and tweets, in his familiar way, “$x$, $x$, $x$, folks. We’re getting $x$.” or “People want $x$, folks. $x$, $x$.”, depending on how long $x$ is.
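As a rough sketch of the templating step, here is one way the two phrasings could be filled in. The function name `ford_tweet`, the cutoff `short_limit`, and the choice of which template goes with short phrases are all my own assumptions for illustration; the only thing fixed above is that the choice depends on how long $x$ is.

```python
def ford_tweet(x, short_limit=20):
    """Fill one of the two templates, chosen by the length of x.

    short_limit is a hypothetical cutoff, and pairing the four-copy
    template with short phrases is an assumption.
    """
    if len(x) <= short_limit:
        # Short phrase: the template that repeats x four times.
        return f"{x}, {x}, {x}, folks. We’re getting {x}."
    # Long phrase: the template that repeats x three times.
    return f"People want {x}, folks. {x}, {x}."


print(ford_tweet("subways"))
print(ford_tweet("the vehicle registration tax"))
```

Putting the longer phrases in the template with fewer repetitions is just a guess at keeping tweets under the length limit.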

How does it work? Well, the Wikipedia portion is easy enough. We just call the API to grab a list of ten random pages. The more complicated part is pulling stuff off of #TOpoli.
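For the curious, the Wikipedia half might look something like the sketch below, which uses the MediaWiki API's `list=random` query. The helper names and the `User-Agent` string are placeholders of mine, not what the bot actually runs.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def random_titles_url(limit=10):
    """Build a MediaWiki query URL for `limit` random article titles."""
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": 0,  # main article namespace only
        "rnlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def parse_titles(payload):
    """Pull the titles out of a decoded list=random response."""
    return [page["title"] for page in payload["query"]["random"]]


def fetch_random_titles(limit=10):
    """Call the API over the network and return the random titles."""
    # The User-Agent value is a hypothetical placeholder.
    request = Request(random_titles_url(limit),
                      headers={"User-Agent": "example-bot/0.1"})
    with urlopen(request) as response:
        return parse_titles(json.load(response))
```

Splitting the URL building and the parsing away from the network call makes the pieces easy to check without hitting Wikipedia.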

Since this is a twitter app, it’s not too hard to get a bunch of tweets off of #TOpoli. We just use the API and we’ve got the fifteen latest #TOpoli tweets ready to be used. The difficult part is extracting noun phrases, or NPs, which is where that graduate class on computational linguistics comes in handy.

So how do we do this? Well, first of all we have to identify the parts of speech in a given tweet. So we tokenize it first and split it up into words and punctuation. Then, we use a part-of-speech tagger to go through and figure out what each little piece is. The POS tagger that I used was the default one that came with the Natural Language Toolkit. Normally, you’d need to train a tagger on a corpus. This default one was trained on the Brown corpus, which is a body of text that was hand-tagged for training purposes.

So now our tweets are all tagged and we assume that they’re properly tagged. There are obviously going to be some slight errors here and there, but whatever, we want to make a twitter bot, so it’s not that important. But we only have our parts of speech. We want to be able to relate the different parts of speech into phrases. So we need some kind of parsing or chunking to put these pieces together into larger phrases that make sense.

For this, I used a bigram chunker trained on the conll2000 corpus. Like the Brown corpus for tagging, the conll2000 corpus is manually parsed and chunked for training purposes. A bigram chunker analyses every consecutive pair of words in the training sentences to come up with a statistical model. It then uses this model to come up with the most likely NPs to arise from a new sentence. We can then just pluck out all of the NPs the chunker identifies.
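The plucking step can be sketched without any NLTK machinery at all. The conll2000 corpus marks chunks with IOB tags (`B-NP` starts a noun phrase, `I-NP` continues one, anything else is outside), so assuming the chunker's output has been flattened into `(word, pos, chunk_tag)` triples, something like the following would collect the NPs. The function name and the hand-tagged example sentence are made up for illustration.

```python
def extract_noun_phrases(iob_tokens):
    """Collect NP strings from (word, pos_tag, chunk_tag) triples."""
    phrases, current = [], []
    for word, _pos, chunk_tag in iob_tokens:
        if chunk_tag == "B-NP":
            # A new NP begins; close off any NP in progress.
            if current:
                phrases.append(" ".join(current))
            current = [word]
        elif chunk_tag == "I-NP":
            # Continue the current NP (a stray I-NP starts a new one).
            current.append(word)
        else:
            # Outside any NP: close off the NP in progress, if any.
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hand-tagged example, not real chunker output.
tagged = [
    ("The", "DT", "B-NP"), ("mayor", "NN", "I-NP"),
    ("hates", "VBZ", "B-VP"),
    ("new", "JJ", "B-NP"), ("streetcars", "NNS", "I-NP"),
]
print(extract_noun_phrases(tagged))  # ['The mayor', 'new streetcars']
```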

Once we have all of our NPs, we stick them in a list with our Wikipedia titles and randomly select one to use in our tweet. The Wikipedia API has a limit of 10 titles per call and the twitter API grabs 15 tweets per call. Thus, the chance of getting a Wikipedia title is at best around 2/5, since 10 of the 25 candidates come from Wikipedia. However, that’s not taking into account removing entries that are too long. That quick calculation also assumes that there’s only one NP per tweet when there could be many, so in reality, the chance of grabbing something from #TOpoli is much higher, which might be for the best if you want weird accidental metacommentary.
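To make the back-of-the-envelope number concrete, here is the same calculation as a tiny function. The name `wikipedia_share` and the figure of three NPs per tweet are illustrative assumptions, not measurements from the bot.

```python
from fractions import Fraction


def wikipedia_share(wiki_titles=10, tweets=15, nps_per_tweet=1):
    """Chance that a uniformly random pick is a Wikipedia title."""
    candidates_from_tweets = tweets * nps_per_tweet
    return Fraction(wiki_titles, wiki_titles + candidates_from_tweets)


print(wikipedia_share())                 # 2/5
print(wikipedia_share(nps_per_tweet=3))  # 2/11
```

With even a few NPs per tweet, the Wikipedia share drops well below 2/5.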

The Core Service Review

One day, I decided to look through the City of Toronto’s open data catalogue and happened upon an interesting entry called the Core Service Review Qualitative data.

Lo and behold, it was exactly that.

After some fiddling around with an Excel module for Python and figuring out how to split tweets that are larger than 140 characters, I let it go.
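Splitting long responses can be done in a few ways. One simple option, which is not necessarily what the bot does, is to lean on the standard library's `textwrap` and break on word boundaries; `split_into_tweets` is a name I made up for the sketch.

```python
import textwrap


def split_into_tweets(text, limit=140):
    """Break text into chunks of at most `limit` characters.

    textwrap breaks at whitespace where it can and only splits inside
    a word when the word alone is longer than the limit.
    """
    return textwrap.wrap(text, width=limit)


response = "Keep the libraries open. " * 12
for chunk in split_into_tweets(response):
    print(len(chunk), chunk)
```

Note that `textwrap` also collapses newlines and other whitespace into single spaces, which is probably fine for survey answers.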

@TOCoreService will tweet one entry, randomly chosen, from the 12000 submissions, or close to 58000 answers. These range from short answers like “transit” or “taxes” to fairly lengthy responses.

So what’s the point of this bot? Well, the data is up there for anyone to read, which is nice for transparency and engagement. Of course, whether anyone who’s not city staff would want to read 13000 responses is another matter. But here, we have a pretty decent collection of opinions on what our priorities should be from real citizens. It’d be a shame if the only people who read them were city staff.

Toronto City Council, 2012

2012 has been a hell of a year, especially if you’re into the city council scene in Toronto. Basically, the year in Toronto politics can be summed up by the following neat graph.

Ford, 2012

Yikes.

What you’re looking at is a similarity graph of recorded votes in city council, from October 2011 to October 2012. An edge is drawn between two councillors if they voted the same way 90% of the time. The edge is coloured blue if they voted together 92.5% of the time and it’s green if they voted together 95% of the time. Remember, the last time we did this, the graph looked kind of like this:
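Here is a minimal sketch of how edges like these could be computed. It assumes each councillor's record is a dict from a motion id to their vote, compares only motions both councillors voted on, and treats the thresholds as inclusive; none of that is spelled out above, and the function names and the "black" colour for the base edge are placeholders.

```python
from itertools import combinations


def agreement(votes_a, votes_b):
    """Fraction of shared motions on which two councillors voted alike."""
    shared = [motion for motion in votes_a if motion in votes_b]
    if not shared:
        return 0.0
    same = sum(1 for motion in shared if votes_a[motion] == votes_b[motion])
    return same / len(shared)


def edge_colour(rate):
    """Map an agreement rate to an edge colour, or None for no edge."""
    if rate >= 0.95:
        return "green"
    if rate >= 0.925:
        return "blue"
    if rate >= 0.90:
        return "black"  # hypothetical colour for the base edge
    return None


def similarity_edges(records):
    """Return {(a, b): colour} for every pair that clears the threshold."""
    edges = {}
    for a, b in combinations(sorted(records), 2):
        colour = edge_colour(agreement(records[a], records[b]))
        if colour is not None:
            edges[(a, b)] = colour
    return edges
```

The resulting edge list can then be handed to whatever graph drawing tool you like.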

Ford, 2011

Let’s refresh our memory of the first year of the Ford council. Most councillors were willing to work with Ford in the face of his relative popularity at the time. Right-wing and centrist councillors tried to position themselves to gain the mayor’s favour and the mayor had a pretty easy time getting his agenda through. With little effort, he was able to repeal the vehicle registration tax and put an end to Miller’s Transit City plan. It seemed like we were in for a long four years.

But then, something happened over the summer. In his quest for efficiencies, the mayor had actually dived into the realm of cutting services. People didn’t like that. After all, the mayor had promised he could reduce spending without cutting services. And it’s here where the mayor and the citizens diverged, on where the line between finding efficiencies and cutting services was drawn. And so, the mayor’s popularity dropped.

And then there was the hilarious Port Lands thing, but whatever.

Anyway, fast forward to January 2012, when the vote on the budget was taking place. Via some fascinating political manoeuvring, a majority of councillors were able to reverse some of the mayor’s planned cuts. In February, a majority of councillors, again, reversed the mayor’s plans and performed some necromancy on the Transit City LRT lines. Council had realized sometime in the preceding months that the mayor was no longer the threat that he was at the beginning of the term and his refusal to compromise on some very reasonable points made him look worse.

And so, 2012 has played out, with Council taking the task of governance into its own hands, without the guidance of the mayor.

So how different do the dynamics of council look after 2012? Here’s a graph that takes all of the data from the beginning of Ford’s term up until the last council meeting on October 4, 2012.

Ford, 2011-12

What will this graph look like by the end of the term, in 2014? It’s hard to say. Remember, we were all expecting 2011 to be the new normal, until 2012 hit. Who knows what could happen in another year? The alliances at council are always shifting and there’s always the temptation for some councillors to go out on a limb and inadvertently blow something up in the process.

Well, there is one thing, which is that the data could look significantly different because of a structural change made at the last council meeting. In Ford’s council, the mayor insisted that every single vote be a recorded vote, in an effort to improve accountability and transparency. Of course, what this means is a blowup in the number of recorded votes for things like speaking extensions. During the last council meeting, it was decided that speaking extensions would be done away with. This will likely affect the data because most councillors usually just vote yes. Well, except for the one councillor who always votes against speaking extensions: the Midnight Mayor, Mark Grimes.

Bonus: Miller, 2009-10

Toronto’s council voting records go all the way back to the beginning of 2009. Since I was already clicking endlessly to download the voting records for the year, I thought it’d be neat to see how different council was back in the final days of the Miller council. The time period represented here is from January 2009 to the final council meeting in August 2010.

Miller, 2009-10

The most striking thing is, of course, where the former councillor for Ward 2 can be found.

It was kind of tough to figure out a good threshold for this dataset because the differences in voting were much, much greater, almost certainly caused by Ford’s insistence on recorded votes for speaking extensions. Here, an edge is drawn between two councillors if they voted together at least 75% of the time. If they voted together more than 85% of the time, then the edge is orange, and the edge is red if they voted together more than 90% of the time. Of course, the colour scheme was chosen to reflect the evil New Democratic Communists running council at the time. Since I didn’t really pay attention to council during those years, I don’t have much to say, but I’m sure that if you did, you’ll find some interesting quirks.

Correspondence data

God in $n$-space

So here’s a question about the nature of God that’s probably atypical. But I should probably preface this by saying that this is purely an academic exercise and thought experiment and that I’m not really looking to establish any deep theological truths. It’s entirely possible that I’m horribly wrong.

One of the things Christians do when describing God’s eternal nature is to say that because he has no beginning and no end, he exists outside of time.

I’ve never really understood what this meant.

The rationale for this kind of explanation is that our finite minds can’t comprehend infinity. As a mathematician, that notion seems kind of silly. Here, I’ll give an example of something we’re all familiar with that doesn’t have a beginning or end: $\mathbb Z$, the set of integers. We’re even able to distinguish between different cardinalities of infinity and have developed useful number systems in which we can, yes, divide by zero and get infinity as a legitimate result. So what’s the problem?

To me, the notion that God exists outside of time is like saying God exists outside of space. No one seems to have a problem with the second one; after all, omnipresence is one of God’s attributes. This is an idea we can use.

I’m sure we’re all familiar with the concept of 3-space, or $\mathbb R^3$, which is how we describe the three physical dimensions of space and all. So God’s omnipresence in 3-space would just mean that he’s present at every point in $\mathbb R^3$.

But mathematicians aren’t satisfied with stopping at $\mathbb R^3$. We like to generalize, which is where we get into things like $\mathbb R^n$ for some positive integer $n$. Or how about even $\mathbb C^n$? So now we’ve got $n$-dimensional space to deal with. That’s hard to wrap your head around if you try to think of it in analogous physical terms (because there aren’t any). Anyhow, we don’t even have to stop at finite-dimensional spaces; we can extend things to infinite-dimensional spaces.

Whether or not these things actually physically exist isn’t that important. We’re just concerned with this: how does God’s omnipresence translate when we extend space to however many dimensions? It’s simple: he’s still present at every point in space.

So what if we take one of those dimensions to be time? I mean, a lot of people often like to think of time as the fourth dimension.

Then God is present at every point in time as well. For me, thinking about it this way actually answers a question I did have for a while: what is meant by God’s unchanging nature? This is one of those questions that the outside of time thing was meant to “answer” but it doesn’t actually answer anything, since it really just handwaves it away. But with the dimensionality angle, we can say that God is the same entity at every point in time.

I’m sure there are plenty of other questions that arise from thinking about it like this, but, at least for me, the advantage in this approach is that it’s analogous to ideas we’re already comfortable with, namely God’s omnipresence in $\mathbb R^3$. It explains how God could change his mind and direct things at multiple points in time if he wanted to.

So this thought experiment led to an interesting question on prayer. One component of prayer is that Christians often petition God to act in some way, in the present or future. But if God is all-powerful and ever-present, does it make sense to pray for things that occurred in the past?