The R&D team at Fortia Financial Solutions recently took part in an open challenge at the 11th International Workshop on Semantic Evaluation (SemEval 2017), and ranked 1st. http://acl2017.org/

In the information age, the interaction of social media and news can have a significant impact on the financial sector (e.g. influencing the rise or fall of stock values). Thus, any public or private actor in the field could benefit from real-time insights derived from large-scale analysis of such information sources.

To this end, our team at Fortia, along with Italian researchers from Fondazione Bruno Kessler, took part in SemEval 2017 Task 5.2, which focused on Fine-Grained Sentiment Analysis on Financial News, and devised the best-performing system among ~40 submissions.

So, what is SemEval? With the first challenge (then named SensEval) dating back to 1998, the SemEval challenges have over the years established themselves as the premier benchmark venue for Computational Linguistics. The 2017 edition includes 12 tasks, covering diverse domains such as medicine, finance, and social media, among others.

As exemplified in the image below, in a SemEval task the data is collected and annotated by linguists or domain experts to produce a Gold Standard, which contains the scores against which the systems' performance will be evaluated.

Original Image from Wikipedia

For the task at hand, given a news headline containing one or more mentions of brands/companies, the goal was to devise an algorithm able to infer whether, and to what degree, the headline portrayed a Bullish (positive, going up) or Bearish (negative, going down) sentiment towards each mentioned brand/company.

A sample headline looks like:

Headline: “Morrisons book second consecutive quarter of sales growth”

Company name: “Morrisons”

Sentiment score: 0.43

Here, the algorithm receives the Headline and Company name values as input, and its goal is to output a value as close as possible to the given Sentiment score (the Gold Standard value), which ranges from -1 (very bearish) to 1 (very bullish).

Our winning approach combines Deep Learning techniques (GloVe embeddings, CNNs) with the DepecheMood affective lexicon, reaching a similarity score of 0.745 with the task Gold Standard.
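
This post does not spell out how the similarity score is computed. Purely as an illustration, the sketch below assumes plain cosine similarity between the vector of predicted scores and the vector of Gold Standard scores. The function name and the sample values (other than 0.43, taken from the Morrisons example) are hypothetical and are not taken from the task data.

```python
import math
from typing import Sequence


def cosine_similarity(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Cosine similarity between two equally long vectors of sentiment scores."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    dot = sum(p * g for p, g in zip(predicted, gold))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_g = math.sqrt(sum(g * g for g in gold))
    if norm_p == 0.0 or norm_g == 0.0:
        # Convention chosen for this sketch: an all-zero vector scores 0.
        return 0.0
    return dot / (norm_p * norm_g)


# Hypothetical scores for three headlines; only 0.43 comes from the example above.
gold = [0.43, -0.60, 0.10]
predicted = [0.35, -0.50, 0.20]
score = cosine_similarity(predicted, gold)
print(f"similarity: {score:.3f}")
```

Under this assumption, a system whose predictions point in the same direction as the Gold Standard scores approaches 1, while one that systematically inverts bullish and bearish approaches -1.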

Want more details? A paper describing our approach is now available on arXiv, and will be presented at the ACL 2017 conference in August.