It would be easy to write software to parse that particular phrasing of that particular relationship ("Company X in Talks to Buy Company Y"), but to catch any arbitrary relationship between two entities in a given domain? That's no longer an easy problem. Not to mention that you'll need to represent that world knowledge coherently in ontologies / knowledge bases and write complex logic around it to make the data actionable.
All you need is a positive correlation between profit and an action taken on a weighted evaluation of the keywords in a headline. You're overestimating the difficulty of the problem. It's model training; there's lots of historical data to tune on.
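To make the "weighted evaluation of a headline" idea concrete, here is a rough sketch. The keyword weights, the function names, and the toy numbers in the usage note are all made up for illustration; real weights would have to be tuned on historical data, and none of this says the correlation is actually there.

```python
import re

# Hypothetical keyword weights, invented for illustration only.
# In practice these are the parameters you would fit on historical data.
WEIGHTS = {"buy": 1.0, "acquire": 1.0, "talks": 0.5, "lawsuit": -1.0, "recall": -0.8}


def score_headline(headline, weights=WEIGHTS):
    """Sum the weights of the known keywords that appear in the headline."""
    tokens = re.findall(r"[a-z]+", headline.lower())
    return sum(weights.get(tok, 0.0) for tok in tokens)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = (var_x * var_y) ** 0.5
    # A constant series has no defined correlation; report 0.0 instead of dividing by zero.
    return cov / denom if denom else 0.0
```

Usage would be something like `pearson([score_headline(h) for h in headlines], returns)`, where `headlines` and `returns` are your own historical headlines and the price moves that followed them. If that number is reliably positive out of sample, you have something to trade on; if not, you retune the weights or give up.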
> It's model training; there's lots of historical data to tune on.
Is there a timestamped archive of Dow Jones Newswire articles? I found it difficult to find archived news articles the last time I was looking for them.
Sentiment analysis has plenty of research around it, and once you've got a big enough training and validation set, it can give very good results. My old boss works for a company that runs sentiment analysis on comments made about companies to automatically highlight positive/negative messages - it's not a massive leap from there to repurpose that to analyse positive or negative news reports about a company.
I very much feel this way. I also live in (but did not grow up in) a country where the average yearly income is ~$800. So, I guess my neighbors shouldn't get to experience all the interesting media that I did?
Also, there are many things that I feel I have an obligation to be informed about (like junk on CNN / Fox News, or war propaganda like "Zero Dark Thirty" or "American Sniper"), but I would be ethically opposed to paying for.
The string matching is fuzzy though. It is case insensitive and it ignores at least some non-alphanumeric characters.
For example, "twenty-four" will return results about the TV show "Twenty Four" as well as results for "twenty four" and "twenty-four".
There are some other maddening exceptions for those who would prefer exact string matching, but I can't remember them all and they've changed over the years too.
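For anyone wondering what that kind of matching looks like, here is a minimal sketch covering only the two behaviors mentioned above (case insensitivity and ignoring non-alphanumeric characters). The function names are hypothetical and this is not the search engine's actual implementation, which clearly has more rules than these.

```python
import re


def normalize(text):
    """Lowercase the text and collapse every run of non-alphanumeric characters to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fuzzy_contains(query, document):
    """True if the normalized query appears in the normalized document as whole words."""
    q = normalize(query)
    if not q:
        return False
    # Pad with spaces so "art" does not match inside "party".
    return f" {q} " in f" {normalize(document)} "
```

With this, `normalize("twenty-four")` and `normalize("Twenty Four")` both come out as `"twenty four"`, which is why a search for one returns results for the other.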
Between the famous "whenever I fire a linguist..." quote and the "fuck computational linguistics" meme, I would be hesitant to reveal my linguistics background in an interview.
I wouldn't worry about it too much. It was a different time back then, when work was mainly guided by linguistic theory. I think that linguistic background is a plus.
The quote (maybe not his) means that linguists used to ignore actual language usage.
"Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its (the speech community's) language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of this language in actual performance." ~Chomsky, 1965