# Making NFAs smaller

So, you might be wondering what I’m working on now. Basically, I’m looking at algorithms motivated by the idea of reducing nondeterministic finite automata (NFAs).

The problem of NFA reduction is basically to take an NFA and produce an equivalent NFA with fewer states. The natural question to ask is whether we can find an equivalent NFA with the minimal number of states, which is the NFA minimization problem. There are some good reasons that we focus on the easier reduction problem. The first is that, unlike with DFAs, there is in general no unique minimal NFA. The second reason is that NFA minimization is super hard, as in PSPACE-complete (if you know how NP is “hard”, well, PSPACE is “harder” or “really friggin hard”).

Well, okay, but what if we’re not so worried about getting it as small as possible and we just want to get the damned thing smaller? If we can’t guarantee that it’s as small as we can make it, is there some kind of measurement to tell us how far away we are? Or are there special cases of NFAs where we can get the smallest one? It turns out the answer is probably almost always no. Of course, that won’t stop us, because reducing the number of states in an NFA is pretty useful anyway.

There are a number of ways to do this, and it’s not known which is the best; otherwise, we could work on approximation or minimization instead. A lot of solutions take the approach of turning the NFA into a DFA, minimizing it, turning it back into an NFA, and seeing what we get. This is nice, because we know there’s a unique smallest DFA and it’s not too hard to find. The problem is that the NFA-to-DFA transformation can blow up the number of states exponentially in the worst case. We’d like to avoid this.
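To make the determinize step concrete, here’s a minimal sketch of the subset construction in Python (the function names and the dict encoding of the NFA are my own, not from any particular library). The example NFA accepts strings over $\{a,b\}$ whose second-to-last letter is $a$; it has 3 states and yields a 4-state DFA, which is modest, but there are families of NFAs where the blowup really is exponential.

```python
from itertools import chain

def nfa_to_dfa(alphabet, delta, start, finals):
    """Subset construction: build an equivalent DFA from an NFA.

    delta maps (state, letter) -> set of successor states.
    DFA states are frozensets of NFA states, so in the worst case
    there can be 2^|Q| of them -- the blowup we'd like to avoid.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    dfa_finals = set()
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        S = worklist.pop()
        if S & finals:
            dfa_finals.add(S)
        for a in alphabet:
            # The a-successor of a subset is the union of a-successors.
            T = frozenset(chain.from_iterable(
                delta.get((q, a), ()) for q in S))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                worklist.append(T)
    return seen, dfa_delta, start_set, dfa_finals

# NFA for "second-to-last letter is a": nondeterministically guess
# when the second-to-last position has been reached.
delta = {(0, 'a'): {0, 1}, (0, 'b'): {0},
         (1, 'a'): {2}, (1, 'b'): {2}}
states, dfa_delta, dfa_start, dfa_finals = nfa_to_dfa('ab', delta, 0, {2})
print(len(states))  # 4 reachable subset-states from a 3-state NFA
```

Generalizing this example to “$n$-th letter from the last is $a$” gives an $(n+1)$-state NFA whose smallest DFA needs $2^n$ states, which is the standard witness for the exponential worst case.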

The idea is to try to find states that are superfluous. What do we mean by superfluous? Well, if I read a word $ababa$ and I can end up in either a state $p$ or a state $q$, then we probably don’t need two different states. Similarly, if we can start in either a state $p$ or a state $q$ and use up the word $aaaba$ to get to a final state, we might not have needed two different states for that. This is essentially the kind of thing we do in DFA minimization, but in that case it’s a lot easier because the state transitions are deterministic.

What we’d do formally is define a relation that captures this idea. For an NFA $N=(Q,\Sigma,\delta,q_0,F)$, we’d define some relation $R$ that satisfies the following conditions:

1. $R\cap(F\times(Q-F))=\emptyset$
2. $\forall p,q\in Q, \forall a\in \Sigma,\ (pRq \implies \forall q'\in\delta(q,a), \exists p'\in \delta(p,a),\ q'Rp')$.

Condition 1 rules out final states being related to non-final states. Condition 2 tells us that if two states $p$ and $q$ are related, then every state that $q$ transitions to will be related to some state that $p$ transitions to on the same letter. As long as these two conditions hold, we can define any crazy old relation to work with and merge related states, gathering our states into a smaller set of states.
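As a sketch of how you might actually compute such a relation (the names here are mine, not from the literature): start with every pair allowed by condition 1, then repeatedly throw out pairs that violate condition 2 until nothing changes. This greatest-fixpoint refinement finds the largest relation satisfying both conditions, and states related in both directions are candidates for merging.

```python
def largest_relation(states, alphabet, delta, finals):
    """Greatest fixpoint of the two conditions above.

    delta maps (state, letter) -> set of successor states.
    Returns the largest R such that no final state is related to a
    non-final state (condition 1), and pRq implies every a-successor
    q' of q is matched by some a-successor p' of p with q'Rp'
    (condition 2).
    """
    # Condition 1: all pairs except (final, non-final).
    R = {(p, q) for p in states for q in states
         if not (p in finals and q not in finals)}
    changed = True
    while changed:
        changed = False
        for p, q in list(R):
            for a in alphabet:
                # Is some a-successor of q unmatched by every
                # a-successor of p (under the current R)?
                bad = any(
                    all((q2, p2) not in R for p2 in delta.get((p, a), ()))
                    for q2 in delta.get((q, a), ()))
                if bad:
                    R.discard((p, q))
                    changed = True
                    break
    return R

# States 1 and 2 behave identically here; the relation holds in both
# directions between them, so they can be merged into one state.
delta = {(0, 'a'): {1, 2}, (1, 'b'): {3}, (2, 'b'): {3}}
R = largest_relation({0, 1, 2, 3}, 'ab', delta, {3})
```

This naive refinement loop is cubic-ish and only meant to illustrate the fixpoint idea; the actual algorithms in the literature compute these simulation-style relations much more cleverly.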

This is essentially what we do when we want to minimize DFAs. The difference is that it’s a lot easier to define this relation because transitions only have one outcome in the deterministic case. In fact, there’s a specific relation, called the Myhill-Nerode equivalence relation, which has the property that $L$ is a regular language if and only if the Myhill-Nerode relation has a finite number of equivalence classes. And these equivalence classes turn out to correspond to the states in the minimal DFA for $L$.

However, as mentioned before, we have no such luck with NFAs. We have a lot of relations to choose from, and it’s not clear which of them is best, which combinations to use, or how to iterate them to get the best result. This is where the hard part is: figuring out which of these relations gets us the smallest NFA in the most efficient way possible.

Back in the fall, I had to do a project for my social network theory course. Not too long before that, I saw a neat post on CalgaryGrit that talked about how to identify similarly aligned groups and councillors on Toronto City Council based on their voting records. With the help of the guy who actually did that analysis and his scripts, I was able to compile my own data from Toronto’s council voting records, which were conveniently offered as part of the city’s open data.

And here’s what came out:

This is a graph of Toronto City Council, where an edge between two councillors means that they have a high degree of similarity in their voting records. The period considered is from the beginning of the term, December 1, 2010, up until September 30, 2011, which was the last day of voting data I had available before I had to start work on the project. An edge means that the two councillors voted the same way at least 90% of the time. An edge is coloured blue if that agreement is at least 92.5%, and green if it’s at least 95%. Or you can think of those numbers as 10%, 7.5%, and 5% different. Same thing.
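For the curious, the similarity computation itself is simple. Here’s a toy sketch of how the thresholded edges could be computed; the councillors and votes below are invented, not the real Toronto data, and the function names are my own.

```python
from itertools import combinations

votes = {
    # councillor -> votes on the same sequence of motions ('Y'/'N')
    'A': ['Y', 'Y', 'N', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y'],
    'B': ['Y', 'Y', 'N', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N'],
    'C': ['N', 'N', 'Y', 'N', 'N', 'Y', 'N', 'N', 'N', 'N'],
}

def agreement(u, v):
    """Fraction of motions on which two councillors voted the same way."""
    same = sum(a == b for a, b in zip(votes[u], votes[v]))
    return same / len(votes[u])

def edge_colour(sim):
    """Map a similarity score to the thresholds from the post."""
    if sim >= 0.95:
        return 'green'
    if sim >= 0.925:
        return 'blue'
    if sim >= 0.90:
        return 'plain'
    return None  # below 90% agreement: no edge

edges = {}
for u, v in combinations(sorted(votes), 2):
    colour = edge_colour(agreement(u, v))
    if colour:
        edges[(u, v)] = colour
```

Here A and B agree on 9 of 10 motions, so they get a plain edge, while C disagrees with both far too often to be connected at all.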

It’s important to note that the edges aren’t weighted. This means that the physical distance between councillors in the drawing doesn’t actually mean anything. That Rob Ford is drawn at the top right corner is a coincidence, and it doesn’t mean that Denzil Minnan-Wong is more similar to the mayor than Doug Ford. Similarly, it doesn’t mean that Paula Fletcher is necessarily the councillor least similar to the mayor just because she’s the furthest away in the drawing.

Of course, the drawing is oriented so that you can make less fine-grained observations. It’s pretty easy to see which councillors are on the right or left wing and which ones reside in the middle. We can also see that the right wing of council votes together a lot more than the left wing. We can even kind of identify which councillors are most likely to break with their respective groups.

In order to get more detailed analysis, we need to look at actual numbers. There are a bunch of graph theoretic metrics that we can use together with social network theory to talk about how certain groups or councillors might act. I get into that sort of stuff in my writeup. Of course, a lot of that is handwaving since I don’t really get hardcore into stats and a lot of it is political analysis, which I’m kind of an amateur at.

Interestingly, I concluded my writeup by saying something like how, based on all of the analysis and numbers, the Mighty Middle didn’t really show any signs of life. And even if it did, it wouldn’t get very far, since the number of votes Ford had on his side was pretty stable and he was only one or two votes short of a majority.

Well, council sure showed me. One of the first things that happened in 2012 was the Mighty Middle that I said didn’t really exist covertly orchestrated enough votes to reverse a bunch of the mayor’s cuts. So what does this mean?

Well, the most obvious lesson is that you can’t predict what’ll happen in volatile political scenarios based just on data. Obviously, the situation seemed stable, both from the perspective of the data and people watching city hall. I mean, the whole budget thing surprised everyone.

Of course, the mistake there was to assume we were in a stable situation when we weren’t. Yeah, the mayor had a lot of safe votes, but he was still missing one or two. Everyone’s been treating this as the mayor having an unfortunate stranglehold on council, even though in a parliamentary setting, missing one or two votes is still enough to throw everything into uncertainty (see Queen’s Park). And since council is a non-partisan legislature, this volatility should’ve been even more expected. But that tiny gap managed to fool everyone, even though all it’d take is one or two councillors to ruin everything for the mayor.

Interestingly enough, the Mighty Middle councillors did do what I said they’d be good at. They were able to use their position to broker a compromise that managed to peel off enough votes from the mayor. When they did combine their powers, they were really effective. But also notice that the coalition they forged was still really fragile. Even on the day of the vote, it wasn’t a complete slam dunk for the opposition until the vote was done. All it would’ve taken to ruin everything was one or two councillors, which is something I did note in my writeup. Of course, like I said earlier, this caveat and margin of error holds for the mayor’s side as well.

And here’s where numbers aren’t enough to capture everything. Once someone decides to rock the boat, as the Mighty Middle did in late January, it becomes easier and easier. All of a sudden, the mayor’s grip on council doesn’t seem that bad, and in February, we have the TTC chair and (former) Ford ally Karen Stintz rally council to overturn the mayor’s transit plan.

This is obviously a huge vote. Unfortunately, my model doesn’t weight votes according to their importance, so this is treated the same as any banal vote, like voting on extensions of speaking time or something. I did throw in some stuff about possible ways of classifying votes, but that’s already hard to do in an objective way. Figuring out how to weight the importance of votes automatically is probably even harder.

None of that changes the fact that I was way off in my conclusion. Of course, I’m thrilled at what’s been happening at city hall, so I guess I should be happy that all of this happened right after I finalized my writeup and got marked on it.

If you’re interested in reading about how wrong I was, here’s the writeup.