Natural Language Processing API

Overview

The Natural Language Processing API is a collection of services that analyze and add value to unstructured text.  These APIs leverage Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing, and Natural Language Generation techniques.


AI Text Summarization

The AI Text Summarization service takes text as an input and returns a computer-generated summary that represents the most important information in the original content.

AI Themes

The AI Themes service identifies significant concepts in unstructured text.  AI Themes can be used to quickly understand the contents of a document, as a first step in determining the sentiment of a document, or to discover when new ideas emerge. Every theme is paired with a numeric value to indicate its significance within the document, and can additionally be paired with a sentiment score.

Named Entity Recognition

The FactSet NER (Named Entity Recognition) service identifies companies, people, locations, health conditions, drug names, numbers, monetary values, and dates from unstructured or semi-structured documents. In addition to providing the text and type of the entity names, along with their start and end offsets in the document, this service also provides the best matching FactSet identifiers for companies and people found in the text. This unique FactSet identifier allows you to link any document with other FactSet content sets, such as historical prices or fundamental data.
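
As a rough illustration of how the start and end offsets might be used, the sketch below wraps each entity span in a bracketed tag. The function name `annotate` and the entity keys (`type`, `start`, `end`) are hypothetical stand-ins chosen for this example, and it assumes `end` is exclusive; the API Definition is the authority on the actual response schema.

```python
def annotate(doc_text, entities):
    """Return doc_text with each entity span wrapped as [TYPE: text].

    `entities` is a list of dicts with hypothetical keys 'type', 'start',
    and 'end' (end assumed exclusive); the real response schema may differ.
    """
    out = []
    cursor = 0
    # Process spans left to right so untouched text is copied in order.
    for ent in sorted(entities, key=lambda e: e['start']):
        if ent['start'] < cursor:
            continue  # skip spans that overlap one already emitted
        out.append(doc_text[cursor:ent['start']])
        span = doc_text[ent['start']:ent['end']]
        out.append(f"[{ent['type']}: {span}]")
        cursor = ent['end']
    out.append(doc_text[cursor:])
    return ''.join(out)


sample = "FactSet opened an office in Paris."
sample_entities = [
    {'type': 'LOCATION', 'start': 28, 'end': 33},
    {'type': 'COMPANY', 'start': 0, 'end': 7},
]
print(annotate(sample, sample_entities))
# prints: [COMPANY: FactSet] opened an office in [LOCATION: Paris].
```

The same loop could emit hyperlinks instead of bracketed tags once an entity carries an identifier to link to.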

Question & Answer

The Question & Answer API is a machine learning service that provides the ability to ask questions about the information contained in text documents. The service will return answer(s) based on the information contained in the text.

Code Snippet
AI Text Summarization Sample
import time

import requests
from requests.auth import HTTPBasicAuth

# Replace the placeholders with your own credentials.
auth = HTTPBasicAuth('user-name', '<password>')


def extract_from_text(doc_text,
                      fds_summarization_url='https://api.factset.com/cognitive/nlp/v1/summarization'):
    # Submit the text; the response body is used as the ID of the pending result.
    payload = {'text': doc_text}
    resp = requests.post(f"{fds_summarization_url}/summary",
                         json=payload, auth=auth)
    # A requests.Response is falsy when the status code is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')

    result_id = resp.json()

    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(
            f"{fds_summarization_url}/result/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)
AI Themes Sample
import time

import requests
from requests.auth import HTTPBasicAuth

# Replace the placeholders with your own credentials.
auth = HTTPBasicAuth('username-serial', 'api-key')


def extract_from_text(doc_text,
                      min_doc_text_size=10,
                      fds_themes_svc_url='https://api.factset.com/cognitive/nlp/v1/themes'):
    # Skip empty, whitespace-only, or too-short input.
    if not doc_text or not doc_text.strip() or len(doc_text) < min_doc_text_size:
        return None

    # Submit the text; the response carries the ID of the pending result.
    payload = {"data": {'text': doc_text}}
    resp = requests.post(fds_themes_svc_url, json=payload, auth=auth)
    # A requests.Response is falsy when the status code is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    result_id = resp.json()["data"]["id"]

    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(f"{fds_themes_svc_url}/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)



Named Entity Recognition Sample
import requests
from requests.auth import HTTPBasicAuth

# Replace the placeholders with your own credentials.
auth = HTTPBasicAuth('username-serial', 'api-key')


def extract_from_text(doc_text,
                      min_doc_text_size=10,
                      fds_ner_svc_url='https://api.factset.com/cognitive/nlp/v1/ner/entities'):
    # Skip empty, whitespace-only, or too-short input.
    if not doc_text or not doc_text.strip() or len(doc_text) < min_doc_text_size:
        return None

    payload = {
        'data': {
            'text': doc_text
        }
    }
    resp = requests.post(fds_ner_svc_url, json=payload, auth=auth)
    # A requests.Response is falsy when the status code is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    response_json = resp.json()

    # Return None if the body is empty, malformed, or reports errors.
    if (not response_json or not isinstance(response_json, dict)
            or 'errors' in response_json or 'data' not in response_json):
        return None
    return response_json['data']['entities']
Question & Answer Sample
import time

import requests
from requests.auth import HTTPBasicAuth

# Replace the placeholders with your own credentials.
auth = HTTPBasicAuth('user-name', '<password>')


def extract_from_text(doc_text,
                      fds_qna_url='https://api.factset.com/cognitive/nlp/v1/qna/answers'):
    # Submit the text; the response carries the ID of the pending result.
    # Confirm the request path and payload fields against the API Definition.
    payload = {'text': doc_text}
    resp = requests.post(f"{fds_qna_url}/summary",
                         json=payload, auth=auth)
    # A requests.Response is falsy when the status code is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')

    result_id = resp.json()["data"]["id"]

    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(
            f"{fds_qna_url}/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)
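
The summarization, themes, and Q&A samples poll in a loop with no upper bound, so a job that never leaves the 'Processing' state would block the caller indefinitely. One possible way to bound the wait is sketched below. The helper `poll_until_done` and its parameters are names invented for this sketch and are not part of the API; `fetch` stands for any zero-argument callable, such as one wrapping `requests.get(...).json()`.

```python
import time


def poll_until_done(fetch, timeout_s=60.0, interval_s=1.0, pending='Processing'):
    """Call fetch() repeatedly until it returns something other than `pending`.

    Raises TimeoutError if no final result arrives within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result != pending:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'No result after {timeout_s} seconds')
        time.sleep(interval_s)


# Stand-in for a real request: reports 'Processing' twice, then a result.
_responses = iter(['Processing', 'Processing', {'summary': 'done'}])
print(poll_until_done(lambda: next(_responses), interval_s=0.01))
```

The 60-second default is arbitrary; a suitable timeout depends on document size and on the service start-up delay.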
Changelog

1.4

Summary

Version 1.4 - AI Themes Sentiment Added 7/21/2023

Functionality Additions

  • AI Themes
    • "includeSentiments" parameter added to extract sentiments from themes.
    • ID and Status endpoints added.

1.3

Summary

Version 1.3 - Question & Answer Added 1/27/2023

Functionality Additions

  • Question & Answer
    • Supports English language text.
    • Input should be in plain text.
    • Minimum input length is 100 characters.
    • Maximum input length is 10,000 characters per request. Any text beyond the maximum length will not be considered for output.
    • The first request made to the Q&A API may take 10-15 seconds longer to generate a response than subsequent requests as the service needs to start up.
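
The length limits above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the service counts characters the same way Python's `len()` does; the constant and function names are made up for this example.

```python
QNA_MIN_CHARS = 100
QNA_MAX_CHARS = 10_000


def check_qna_input(text):
    """Return (ok, message) for the documented Q&A length limits."""
    n = len(text)
    if n < QNA_MIN_CHARS:
        return False, f'too short: {n} < {QNA_MIN_CHARS} characters'
    if n > QNA_MAX_CHARS:
        # Still accepted, but the service ignores text past the maximum.
        return True, f'{n - QNA_MAX_CHARS} trailing characters will be ignored'
    return True, 'ok'


print(check_qna_input('too short'))
```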

1.2

Summary

Version 1.2 - Named Entity Recognition Added 11/7/2022

Functionality Additions

  • Named Entity Recognition - originally released separately from the NLP library
    • Supports English language text.
    • Detects entities (people, places, organizations, etc.) within the text.

1.1

Summary

Version 1.1 – AI Text Summarization Added 9/6/2022

Functionality Additions

  • AI Text Summarization – Originally Released 03/31/2022
    • Supports English language text.
    • Input should be in plain text.
    • Minimum input length is 100 words.
    • Maximum input length is 1,024 words. Any text beyond the maximum length will not be considered for summarized output.
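
Because text past the 1,024-word maximum is ignored, a longer document would need to be split before submission if every part should be summarized. The sketch below shows one simple approach, assuming the service counts whitespace-separated words (its actual tokenization may differ); `split_into_word_chunks` is a hypothetical helper name.

```python
def split_into_word_chunks(text, max_words=1024):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [' '.join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


chunks = split_into_word_chunks(' '.join(['word'] * 2500))
print([len(chunk.split()) for chunk in chunks])
# prints: [1024, 1024, 452]
```

This split ignores sentence boundaries and collapses runs of whitespace, and the final chunk can fall below the 100-word minimum, so a production version would likely merge or rebalance the last pieces.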

1.0

Summary

Version 1.0 – Released AI Themes 6/1/2022

Functionality Additions

  • AI Themes
    • Supports English language text.
    • Input should be in plain text.
    • Minimum input length is 100 characters.
    • Maximum input length is 15,000 characters per request. Any text beyond the maximum length will not be considered for output.
Sample Use Case

AI Text Summarization

  • Reduce large bodies of text: Quickly consolidate information focusing on document highlights
  • Enhance prioritization and focus: Quickly decide document priority based on the main ideas in a body of text
  • Populate News Applications: Developers and engineers can use Summarization to create headlines and story snippets to populate news applications

AI Themes

  • This service is used inside FactSet to extract themes from earnings call transcripts.
  • This service can be used on any unstructured text, such as news stories, company filings, or research reports.
  • When run on multiple historical versions of a document, this API can show how themes change over time.
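
Comparing themes across document versions can be as simple as a set difference. The sketch below assumes the theme names have already been pulled out of two responses as plain strings and that identical themes match exactly; `compare_themes` and the sample theme names are illustrative only.

```python
def compare_themes(old_themes, new_themes):
    """Return themes that emerged, disappeared, or persisted between versions."""
    old, new = set(old_themes), set(new_themes)
    return {
        'emerged': sorted(new - old),
        'dropped': sorted(old - new),
        'persisted': sorted(old & new),
    }


q1 = ['supply chain', 'inflation', 'hiring']
q2 = ['inflation', 'pricing power', 'supply chain']
print(compare_themes(q1, q2))
# prints: {'emerged': ['pricing power'], 'dropped': ['hiring'], 'persisted': ['inflation', 'supply chain']}
```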

Named Entity Recognition

  • When filtering content before sending it to analysts at your firm, use the NER service to identify people, companies, and geographic locations specific to user needs and interests.
  • You can use NER to enrich unstructured text from filings or other documents before storing it in your firm’s internal data stores.
  • Media firms can use NER to identify companies, people, and geographic locations on web pages during the editorial process, and then create hyperlinks from those entities to other relevant data.
  • You can use NER to tag companies mentioned in tweets, press releases, or social media posts for your own data gathering or sentiment analysis.
  • Within FactSet, when you create a research note using e-mail or the IRN Chrome extension, NER identifies which company the note is about.

Question & Answer

  • This service can be used to find the same type of information across multiple documents. For example, you may query several documents with questions such as "what are new products?", "what are new plant-based products?", or "what does this company make?"
  • This service can be used on any unstructured text such as news stories, company filings, or research reports