The Natural Language Processing API is a collection of services that analyze and add value to unstructured text. These APIs leverage Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing, and Natural Language Generation techniques.
AI Text Summarization
The AI Text Summarization service takes text as an input and returns a computer-generated summary that represents the most important information in the original content.
AI Themes
The AI Themes service identifies significant concepts in unstructured text. AI Themes can be used to quickly understand the contents of a document, as a first step in determining the sentiment of a document, or to discover when new ideas emerge. Every theme is paired with a numeric value to indicate its significance within the document, and can additionally be paired with a sentiment score.
Named Entity Recognition
The FactSet NER (Named Entity Recognition) service identifies companies, people, locations, health conditions, drug names, numbers, monetary values, and dates in unstructured or semi-structured documents. In addition to the text and type of each entity, along with its start and end offsets in the document, the service provides the best-matching FactSet identifiers for companies and people found in the text. These unique FactSet identifiers allow you to link any document with other FactSet content sets, such as historical prices or fundamental data.
Question & Answer
The Question & Answer API is a machine learning service that lets you ask questions about the information contained in text documents. The service returns one or more answers based on that text.
- AI Themes.pdf (449.27 KB)
- AI Text Summarization.pdf (432.20 KB)
- Named Entity Recognition.pdf (442.48 KB)
- Q&A.pdf (389.48 KB)
# AI Text Summarization: submit the text, then poll until the summary is ready.
import time
import requests
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('user-name', '<password>')


def extract_from_text(doc_text,
                      fds_summarization_url='https://api.factset.com/cognitive/nlp/v1/summarization'):
    payload = {'text': doc_text}
    resp = requests.post(f"{fds_summarization_url}/summary",
                         json=payload, auth=auth)
    # A requests.Response is falsy when the HTTP status is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    result_id = resp.json()
    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(
            f"{fds_summarization_url}/result/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)
# AI Themes: submit the text, then poll until the themes are ready.
import time
import requests
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('username-serial', 'api-key')


def extract_from_text(doc_text,
                      min_doc_text_size=10,
                      fds_themes_svc_url='https://api.factset.com/cognitive/nlp/v1/themes'):
    # Skip empty or very short input.
    if not doc_text or not doc_text.strip() or len(doc_text) < min_doc_text_size:
        return None
    payload = {"data": {'text': doc_text}}
    resp = requests.post(fds_themes_svc_url, json=payload, auth=auth)
    # A requests.Response is falsy when the HTTP status is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    result_id = resp.json()["data"]["id"]
    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(f"{fds_themes_svc_url}/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)
# Named Entity Recognition: a single request returns the entities directly.
import requests
from requests.auth import HTTPBasicAuth


def extract_from_text(doc_text,
                      min_doc_text_size=10,
                      fds_ner_svc_url='https://api.factset.com/cognitive/nlp/v1/ner/entities'):
    # Skip empty or very short input.
    if not doc_text or not doc_text.strip() or len(doc_text) < min_doc_text_size:
        return None
    payload = {'data': {'text': doc_text}}
    resp = requests.post(fds_ner_svc_url, json=payload,
                         auth=HTTPBasicAuth('username-serial', 'api-key'))
    # A requests.Response is falsy when the HTTP status is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    response_json = resp.json()
    # Treat a malformed or error response as "no entities".
    if (not response_json or not isinstance(response_json, dict)
            or 'errors' in response_json or 'data' not in response_json):
        return None
    return response_json['data']['entities']
# Question & Answer: submit the text, then poll until the answers are ready.
import time
import requests
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('user-name', '<password>')


def extract_from_text(doc_text,
                      fds_qna_url='https://api.factset.com/cognitive/nlp/v1/qna/answers'):
    payload = {'text': doc_text}
    resp = requests.post(f"{fds_qna_url}/summary",
                         json=payload, auth=auth)
    # A requests.Response is falsy when the HTTP status is 4xx or 5xx.
    if not resp:
        raise ValueError(
            f'Received unexpected response from service: status_code: {resp.status_code}')
    result_id = resp.json()["data"]["id"]
    # Poll once per second until the service stops reporting 'Processing'.
    while True:
        resp = requests.get(
            f"{fds_qna_url}/{result_id}", auth=auth)
        if not resp:
            raise ValueError(
                f'Received unexpected response from service: status_code: {resp.status_code}')
        if resp.json() != 'Processing':
            return resp.json()
        time.sleep(1)
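The polling samples above retry once per second with no upper bound, so a job that never leaves the 'Processing' state would block the caller indefinitely. The following is a minimal sketch of a bounded polling loop. The names `poll_until_done`, `fetch`, `timeout_s`, and `interval_s` are hypothetical and not part of the FactSet API; `fetch` stands for any zero-argument callable that returns the decoded JSON of a result endpoint.

```python
import time


def poll_until_done(fetch, timeout_s=60.0, interval_s=1.0, pending="Processing"):
    """Call fetch() until it returns something other than `pending`.

    Hypothetical helper: raises TimeoutError if no final result
    arrives within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result != pending:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout_s} seconds")
        time.sleep(interval_s)
```

In the samples above, `fetch` could be a lambda wrapping the `requests.get(...)` call and its `.json()` decoding.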
1.4
Summary
Version 1.4 - AI Themes Sentiment Added 7/21/2023
Functionality Additions
- AI Themes
  - "includeSentiments" parameter added to extract sentiments from themes.
  - ID and Status endpoints added.
1.3
Summary
Version 1.3 - Question & Answer Added 1/27/2023
Functionality Additions
- Question & Answer
  - Supports English language text.
  - Input should be in plain text.
  - Minimum input length is 100 characters.
  - Maximum input length is 10,000 characters per request. Any text beyond the maximum length will not be considered for output.
  - The first request made to the Q&A API may take 10-15 seconds longer to generate a response than subsequent requests, as the service needs to start up.
1.2
Summary
Version 1.2 - Named Entity Recognition Added 11/7/2022
Functionality Additions
- Named Entity Recognition (originally released separately from the NLP library)
  - Supports English language text.
  - Detects entities (people, places, organizations, etc.) within the text.
1.1
Summary
Version 1.1 – AI Text Summarization Added 9/6/2022
Functionality Additions
- AI Text Summarization (originally released 03/31/2022)
  - Supports English language text.
  - Input should be in plain text.
  - Minimum input length is 100 words.
  - Maximum input length is 1,024 words. Any text beyond the maximum length will not be considered for summarized output.
1.0
Summary
Version 1.0 – Released AI Themes 6/1/2022
Functionality Additions
- AI Themes
  - Supports English language text.
  - Input should be in plain text.
  - Minimum input length is 100 characters.
  - Maximum input length is 15,000 characters per request. Any text beyond the maximum length will not be considered for output.
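The length limits listed in the release notes can be checked on the client before a request is sent. The sketch below is illustrative only: the `LIMITS` table copies the stated minimums and maximums, while the names `LIMITS` and `check_input_length` are hypothetical, and counting words by splitting on whitespace is an assumption that may not match how the service counts them.

```python
# Limits copied from the release notes: (unit, minimum, maximum).
LIMITS = {
    "themes": ("characters", 100, 15_000),
    "summarization": ("words", 100, 1_024),
    "qna": ("characters", 100, 10_000),
}


def check_input_length(service, text):
    """Return (ok, message) describing how `text` fits the service's limits."""
    unit, minimum, maximum = LIMITS[service]
    # Assumption: words are counted by whitespace splitting.
    size = len(text.split()) if unit == "words" else len(text)
    if size < minimum:
        return False, f"{size} {unit} is below the minimum of {minimum}"
    if size > maximum:
        # Over-length input is accepted, but the excess is ignored.
        return True, f"only the first {maximum} {unit} will be considered"
    return True, "within limits"
```

Returning a message instead of raising lets the caller decide whether over-length input should be sent as-is or split into several requests.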
AI Text Summarization
- Reduce large bodies of text: Quickly consolidate information by focusing on document highlights.
- Enhance prioritization and focus: Quickly decide document priority based on the main ideas in a body of text.
- Populate news applications: Developers and engineers can use Summarization to create headlines and story snippets for news applications.
AI Themes
- This service is used inside FactSet to extract themes from earnings call transcripts.
- This service can be used on any unstructured text, such as news stories, company filings, or research reports.
- When used on multiple historical versions of a document, this API can show how themes change over time.
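Comparing themes across document versions can be sketched with plain set operations. This example assumes the themes from each version have already been reduced to lists of strings; the function name `compare_themes` and the sample theme labels are hypothetical and do not come from the service.

```python
def compare_themes(previous, current):
    """Return themes that appeared, disappeared, or persisted between two versions."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "new": sorted(curr_set - prev_set),
        "dropped": sorted(prev_set - curr_set),
        "persisting": sorted(prev_set & curr_set),
    }


# Hypothetical theme labels from two consecutive quarterly transcripts.
q1 = ["supply chain", "pricing", "hiring"]
q2 = ["pricing", "generative ai", "supply chain"]
changes = compare_themes(q1, q2)
```

Exact string matching is the simplest choice here; themes that are worded differently between versions would be reported as one dropped and one new theme.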
Named Entity Recognition
- When filtering content before sending it to analysts at your firm, use the NER service to identify people, companies, and geographic locations specific to user needs and interests.
- You can use NER to enrich unstructured text from filings or other documents before storing it in your firm’s internal data stores.
- Media firms can use NER to identify companies, people, and geographic locations on web pages during the editorial process, and then create hyperlinks from those entities to other relevant data.
- You can use NER to tag companies mentioned in tweets, press releases, or social media posts for your own data gathering or sentiment analysis.
- Within FactSet, when you create a research note using e-mail or the IRN Chrome extension, NER identifies which company the note is about.
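The hyperlinking use case relies on the start and end offsets returned with each entity. The sketch below assumes each entity is a dictionary with `start`, `end`, and `id` keys, that `end` is exclusive, and that the text contains no HTML special characters. Those field names, the `link_entities` function, and the `example.com` URL template are all hypothetical; check the actual response fields in the NER documentation.

```python
def link_entities(text, entities, url_template="https://example.com/entity/{id}"):
    """Wrap each entity span in an HTML anchor.

    Assumes hypothetical entity fields: 'start', 'end' (exclusive), and 'id'.
    """
    linked = text
    # Work from the last entity to the first so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        start, end = ent["start"], ent["end"]
        url = url_template.format(id=ent["id"])
        anchor = f'<a href="{url}">{text[start:end]}</a>'
        linked = linked[:start] + anchor + linked[end:]
    return linked


# Hypothetical entities for a short sentence.
sample_text = "Apple hired Jane Doe."
sample_entities = [
    {"start": 0, "end": 5, "id": "A1"},
    {"start": 12, "end": 20, "id": "P7"},
]
linked_text = link_entities(sample_text, sample_entities)
```

Processing spans from right to left avoids recomputing offsets after each insertion; overlapping entities are not handled in this sketch.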
Question & Answer
- This service can be used to find the same type of information across multiple documents. For example, you may query several documents with questions such as "What are new products?", "What are new plant-based products?", or "What does this company make?"
- This service can be used on any unstructured text, such as news stories, company filings, or research reports.
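Asking the same questions of several documents amounts to looping over every document and question pair. The following sketch keeps the service call abstract: `ask` is any callable taking the document text and a question, and in practice it would wrap a request to the Question & Answer service. The names `ask_all` and `ask`, and the stub used in the example, are hypothetical.

```python
def ask_all(documents, questions, ask):
    """Run every question against every document.

    documents: mapping of document name to text.
    ask: hypothetical callable ask(text, question) that returns an answer.
    Returns a dict keyed by (document name, question).
    """
    return {
        (doc_name, question): ask(text, question)
        for doc_name, text in documents.items()
        for question in questions
    }


# Stub standing in for the real service call.
def fake_ask(text, question):
    return f"{question} -> {len(text)} characters searched"


answers = ask_all(
    {"filing_a": "alpha text", "filing_b": "beta"},
    ["what are new products?", "what does this company make?"],
    fake_ask,
)
```

Keying the results by document and question makes it straightforward to line up answers to the same question side by side.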