GPT-3 Prompt Engineering and NLP For Financial Text

Prompt engineering is the process of designing the input text, or prompt, that a model uses to generate its output. The quality and effectiveness of the output depend on the design of the prompt, so it is important to understand how to craft a good prompt for GPT-3.

In this tutorial, we will explore the basics of prompt engineering for GPT-3. We'll discuss how to perform common natural language processing tasks on financial text, including data extraction, named entity recognition, sentiment analysis, and text classification. Let's get started.

Data Extraction

Data extraction involves extracting structured data from unstructured text. In the previous tutorial, we used the OpenAI API to summarize an earnings call transcript. Let's now use prompting to extract key information from the earnings call.

To begin, let's test some prompts using the OpenAI Playground. To keep the input small, we'll run these prompts on the earnings transcript summary. To process the entire transcript instead, you can combine these prompts with the Python code we wrote in the last video.

Below is a copy of the NVIDIA earnings call summary we produced in the last video:

Simona Jankowski welcomed everyone to NVIDIA's third quarter earnings call. Colette Kress, the executive vice president and chief financial officer, reported that revenue was $5.93 billion, down 12% sequentially and 17% year-on-year. Data center revenue was up 1% sequentially and 31% year-on-year, driven by leading U.S. cloud providers and a broadening set of consumer Internet companies. Gaming revenue was down 23% sequentially and 51% year-on-year due to channel inventory corrections and challenging external conditions. The new Ada Lovelace GPU architecture had an exceptional launch, with the first RTX 4090 becoming available in mid-October.

Colette Kress, NVIDIA's CFO, stated that the sell-through rate for their gaming business was relatively solid in Q3 and expected to be stronger in Q4 due to the upcoming holidays and continued adoption of ADA. Jen-Hsun Huang, NVIDIA's CEO, added that their data center business is indexed to two fundamental dynamics: general purpose computing no longer scaling, and accelerated computing being recognized as the path forward.

NVIDIA's dynamic computing environment is driven by two main forces: general purpose computing and AI. General purpose computing is focused on power efficiency and cost efficiency, while AI is focused on productivity. NVIDIA is making excellent progress in its AI enterprise software stack, which is now available in the cloud and can be accessed through either a GPU instance hour or a software license. The company also took an inventory charge of $702 million in the quarter due to expected changes in demand for China data centers.

Jen-Hsun, I wanted to ask a question about your data center business and the growth outlook. You mentioned that you're seeing strong demand for accelerated computing and AI. Can you just talk a little bit more about what you're seeing in terms of the demand environment? And then, as it relates to the inventory charge, can you just talk a little bit more about what drove that decision? We are changing the way we report our data center business to better reflect the complexity of our customers and their needs. We will now be breaking out hyperscale and cloud purchases separately to provide more insight into the demand for our products. Additionally, we will be providing more color on large installations that we are seeing in the hyperscale space. NVIDIA reported strong Q4 2021 earnings, driven by strong demand for its GPUs. The company's CEO, Jen-Hsun Huang, discussed the strong adoption of NVIDIA GPUs among Internet service companies and cloud computing providers. He also noted that blockchain is not expected to be a major part of the company's business in the future. CFO Colette Kress discussed supply constraints and stock-based compensation.

Named Entity Recognition

The first task we'll perform is named entity recognition (NER), which involves identifying and extracting named entities, such as people, organizations, product names, and locations, from text.

Prompts

Let's try some of the following prompts and observe their output in the OpenAI Playground:

  • Extract named entities from the following text:
  • Extract people, job titles, and product names from the following text:
  • Extract keywords from the following text:
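The prompts above all share one shape: an instruction ending in a colon, followed by the text to analyze. When moving from the Playground to code, a small helper can assemble that string. This is a minimal sketch; `build_prompt` and `NER_INSTRUCTIONS` are hypothetical names, not part of any OpenAI library, and the resulting string still has to be sent to the API with whatever client code you already have.

```python
# Hypothetical helper: assembles "instruction + text" prompts like the ones above.
NER_INSTRUCTIONS = [
    "Extract named entities from the following text:",
    "Extract people, job titles, and product names from the following text:",
    "Extract keywords from the following text:",
]


def build_prompt(instruction: str, text: str) -> str:
    """Join an instruction and the source text, separated by a blank line."""
    return f"{instruction.strip()}\n\n{text.strip()}"


if __name__ == "__main__":
    # Short sample sentence built from facts in the earnings summary.
    sample = "Colette Kress, NVIDIA's CFO, reported revenue of $5.93 billion."
    for instruction in NER_INSTRUCTIONS:
        print(build_prompt(instruction, sample))
        print("-" * 40)
```

Keeping the instructions in a list makes it easy to run the same text through each variant and compare the results side by side.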

Extracting Financial Information

The next task we'll perform is extracting key financial information and numbers. GPT-3 can extract amounts, dates, and transaction details from a text document.

Prompts

Let's try these prompts on the earnings transcript summary:

  • Extract all financial numbers from the following text:
  • Extract all financial numbers and what they represent from the following text:
  • Extract all dates and what they represent from the following text:
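Extracted figures are easier to trust when they can be checked against the source. One rough way to do that, separate from GPT-3 itself, is a regular-expression pass over the same text that lists the dollar amounts and percentages it contains, so the model's answer can be compared against them. This is a minimal sketch: the patterns and the `find_figures` name are assumptions for illustration, and they will miss any format they were not written for.

```python
import re

# Dollar amounts such as "$5.93 billion" or "$702 million" (the scale word is optional).
DOLLAR_RE = re.compile(
    r"\$\d+(?:,\d{3})*(?:\.\d+)?(?:\s(?:thousand|million|billion|trillion))?"
)
# Percentages such as "12%" or "5.1%".
PERCENT_RE = re.compile(r"\d+(?:\.\d+)?%")


def find_figures(text: str) -> dict:
    """Return the dollar amounts and percentages found in the text, in order."""
    return {
        "dollar_amounts": DOLLAR_RE.findall(text),
        "percentages": PERCENT_RE.findall(text),
    }


if __name__ == "__main__":
    sample = (
        "Revenue was $5.93 billion, down 12% sequentially and 17% year-on-year. "
        "The inventory charge was $702 million."
    )
    print(find_figures(sample))
```

A figure that appears in the model's answer but not in this list is worth a second look.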

Prompting For Structured Output

A natural language response is useful, but GPT-3 can also format its response in other ways. For instance, it can return a bulleted list, CSV, or even JSON.

  • Extract people, job titles, and product names from the following text. Return the response as a bulleted list:
  • Extract people, job titles, and product names from the following text. Return the response in CSV format:
  • Extract people, job titles, and product names from the following text. Return the response in JSON format:
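Once the response comes back as CSV, Python's standard `csv` module can turn it into rows. The sketch below assumes the reply starts with a header row; `SAMPLE_CSV_REPLY` is a hand-written stand-in rather than real model output, and the column names GPT-3 actually returns may differ.

```python
import csv
import io

# Hand-written stand-in for a model reply; real output may use other columns or order.
SAMPLE_CSV_REPLY = """name,job_title
Colette Kress,Chief Financial Officer
Jen-Hsun Huang,Chief Executive Officer
"""


def parse_csv_reply(reply: str) -> list:
    """Parse a CSV reply with a header row into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(reply.strip()))
    return [dict(row) for row in reader]


if __name__ == "__main__":
    for row in parse_csv_reply(SAMPLE_CSV_REPLY):
        print(row["name"], "-", row["job_title"])
```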

Giving Examples of Output Format

What if you don't like the response format that GPT-3 provides? You can control the shape of the JSON output by giving an example of the output format in the prompt. Being able to return numerical data, keywords, and other information in a structured format is very powerful: we can use it to build a database, or use GPT-3 as part of a more sophisticated API or commercial product.

Prompts

  • Put the following example output at the end of the extraction prompt. GPT-3 is able to correctly return a list of person names and job titles as JSON objects.
{
   "response": {
        "people": [{"name": "", "job_title": ""}],
        "products": []
   }
}
  • Extract people and product names from the following text. Return a SQL database structure for storing this data:
  • Extract people and product names from the following text. Return SQL INSERT statements to populate a database with this information:
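To illustrate the database idea, a reply that follows the example JSON format above can be parsed with `json` and loaded into SQLite using parameterized inserts, as an alternative to running SQL text generated by the model. This is a minimal sketch under stated assumptions: `SAMPLE_JSON_REPLY` is a hand-written stand-in for a model reply, and the table layout and the `load_reply_into_db` name are illustrative choices.

```python
import json
import sqlite3

# Hand-written stand-in for a model reply that follows the example format above.
SAMPLE_JSON_REPLY = """
{
  "response": {
    "people": [
      {"name": "Colette Kress", "job_title": "Chief Financial Officer"},
      {"name": "Jen-Hsun Huang", "job_title": "Chief Executive Officer"}
    ],
    "products": ["RTX 4090"]
  }
}
"""


def load_reply_into_db(reply: str, conn: sqlite3.Connection) -> None:
    """Parse the JSON reply and store people and products in two tables."""
    data = json.loads(reply)["response"]
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, job_title TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT)")
    # Parameterized inserts keep the model's text out of the SQL itself.
    conn.executemany(
        "INSERT INTO people (name, job_title) VALUES (?, ?)",
        [(p["name"], p["job_title"]) for p in data["people"]],
    )
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)",
        [(name,) for name in data["products"]],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_reply_into_db(SAMPLE_JSON_REPLY, conn)
    print(conn.execute("SELECT name, job_title FROM people").fetchall())
```

In practice `json.loads` raises an error when the reply is not valid JSON, so real code would want to catch that and retry or log the raw reply.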

Sentiment Analysis

In the second video of this series, we used OpenAI Whisper to perform speech recognition. We transcribed the Fed speech and discussed how the language in the speech appeared to have an impact on short-term price movements. Naturally, many people were interested in how we might further analyze this text.

A common NLP task is sentiment analysis, which involves analyzing the sentiment, or emotion, expressed in text: for example, determining whether a product review is positive or negative. Let's see what GPT-3 thinks about the sentences of the Fed speech transcript:

Good afternoon. My colleagues and I are strongly committed to bringing inflation back down to our 2% goal. We have both the tools that we need and the resolve it will take to restore price stability on behalf of American families and businesses. Price stability is the responsibility of the Federal Reserve and serves as the bedrock of our economy. Without price stability, the economy does not work for anyone. In particular, without price stability, we will not achieve a sustained period of strong labor market conditions that benefit all. Today, the FOMC raised our policy interest rate by 75 basis points and we continue to anticipate that ongoing increases will be appropriate. We are moving our policy stance purposefully to a level that will be sufficiently restrictive to return inflation to 2%. In addition, we are continuing the process of significantly reducing the size of our balance sheet. Restoring price stability will likely require maintaining a restrictive stance of policy for some time. I will have more to say about today's monetary policy actions after briefly reviewing economic developments. The U.S. economy has slowed significantly from last year's rapid pace. Real GDP rose at a pace of 2.6% last quarter but is unchanged so far this year. Recent indicators point to modest growth of spending and production this quarter. Growth in consumer spending has slowed from last year's rapid pace in part reflecting lower real disposable income and tighter financial conditions. Activity in the housing sector has weakened significantly, largely reflecting higher mortgage rates. Higher interest rates and slower output growth also appear to be weighing on business fixed investment. Despite the slowdown in growth, the labor market remains extremely tight with the unemployment rate at a 50-year low. Job vacancies still very high and wage growth elevated. Job gains have been robust with employment rising by an average of 289,000 jobs per month over August and September. 
Although job vacancies have moved below their highs and the pace of job gains has slowed from earlier in the year, the labor market continues to be out of balance, with demand substantially exceeding the supply of available workers. The labor force participation rate is little changed since the beginning of the year. Inflation remains well above our longer run goal of 2%. Over the 12 months ending in September, total PCE prices rose at 6.2%, excluding the volatile food and energy categories, core PCE prices rose at 5.1%. The recent inflation data again have come in higher than expected. Price pressures remained evident across a broad range of goods and services. Russia's war against Ukraine has boosted prices for energy and food and has created additional upward pressure on inflation. Despite elevated inflation, longer term inflation expectations appear to remain well anchored, as reflected in a broad range of surveys of households, businesses, and forecasters, as well as measures from financial markets. That is not grounds for complacency. The longer the current amount of high inflation continues, the greater the chance that expectations of higher inflation will become entrenched. The Fed's monetary policy actions are guided by our mandate to promote maximum employment and stable prices for the American people. My colleagues and I are acutely aware that high inflation imposes significant hardship and is at erode's purchasing power, especially for those least able to meet the higher costs of essentials like food, housing, and transportation. We are highly attentive to the risks that high inflation poses to both sides of our mandate, and we're strongly committed to returning inflation to our 2% objective. 
At today's meeting, the committee raised the target range for the federal funds rate by 75 basis points, and we are continuing the process of significantly reducing the size of our balance sheet, which plays an important role in firming the stance of monetary policy. With today's action, we've raised interest rates by 3 and 3-quarters percentage points this year. We anticipate that ongoing increases in the target range for the federal funds rate will be appropriate in order to attain a stance of monetary policy that is sufficiently restrictive to return inflation to 2% over time. Financial conditions have tightened significantly in response to our policy actions, and we are seeing the effects on demand in the most interest rate sensitive sectors of the economy, such as housing. It will take time, however, for the full effects of monetary restraint to be realized, especially on inflation. That's why we say in our statement that in determining the pace of future increases in the target range, we will take into account the cumulative tightening of monetary policy and the lags with which monetary policy affects economic activity and inflation. At some point, as I've said in the last two press conferences, it will become appropriate to slow the pace of increases, as we approach the level of interest rates that will be sufficiently restrictive to bring inflation down to our 2% goal. There is significant uncertainty around that level of interest rates. Even so, we still have some ways to go, and incoming data since our last meeting suggests that the ultimate level of interest rates will be higher than previously expected. Our decisions will depend on the totality of incoming data and their implications for the outlook for economic activity and inflation. We will continue to make our decisions meeting by being and communicate our thinking as clearly as possible. We're taking forceful steps to moderate demand so that it comes into better alignment with supply. 
Our overarching focus is using our tools to bring inflation back down to our 2% goal and to keep longer term inflation expectations well anchored. Reducing inflation is likely to require a sustained period of below trend growth and some softening of labor market conditions. Restoring price stability is essential to set the stage for achieving maximum employment and stable prices in the longer run. The historical record caution strongly against prematurely loosening policy. We will stay the course until the job is done. To conclude, we understand that our actions affect communities, families and businesses across the country. Everything we do is in service to our public mission. We at the Fed will do everything we can to achieve our maximum employment and price stability goals. Thank you and I look forward to your questions.

Prompts

  • Analyze the sentiment of the following text:
  • Return a list of the sentiment of each sentence in the following text:
[ the text ]
sentence number|sentence text|sentiment
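The second prompt ends with a header row, `sentence number|sentence text|sentiment`, which nudges the model to answer in pipe-delimited rows. Rows in that layout are straightforward to parse. The sketch below assumes the reply follows the header's three-column layout; `SAMPLE_REPLY` is hand-written for illustration, and its sentiment labels are placeholders rather than actual GPT-3 results.

```python
# Hand-written stand-in for a reply in "sentence number|sentence text|sentiment" form.
# The sentiment labels are placeholders, not real model output.
SAMPLE_REPLY = """sentence number|sentence text|sentiment
1|Good afternoon.|Neutral
2|Inflation remains well above our longer run goal of 2%.|Negative
"""


def parse_sentiment_rows(reply: str) -> list:
    """Parse pipe-delimited rows, skipping the header and malformed lines."""
    rows = []
    for line in reply.strip().splitlines():
        parts = line.split("|")
        # A valid row has at least three fields and starts with a number.
        if len(parts) < 3 or not parts[0].strip().isdigit():
            continue
        rows.append(
            {
                "number": int(parts[0].strip()),
                # Rejoin the middle fields in case the sentence itself contains "|".
                "text": "|".join(parts[1:-1]).strip(),
                "sentiment": parts[-1].strip().lower(),
            }
        )
    return rows


if __name__ == "__main__":
    for row in parse_sentiment_rows(SAMPLE_REPLY):
        print(row["number"], row["sentiment"], row["text"])
```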

Text Classification

One commenter said the transcription is useless without being able to measure the emotion present in the audio; they wanted to know how confident the speaker was. Personally, I don't find Jerome Powell's speech very emotional. He seems pretty monotone to me. But it is a good point: some people have suggested that Jerome Powell may be bluffing, and you'll notice that when he goes off script in a more candid interview on November 30, he is much more dovish.

However, perhaps we can go beyond basic positive-negative sentiment and classify the language more specifically. For instance, can GPT-3 determine if the language is dovish, hawkish, or neutral?

Prompts

  • Classify each sentence in the following text as neutral or hawkish. Return the hawkish sentences in JSON format:
{
     "response": [{
         "sentence_text": "",
         "classification": ""
     }]
}
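A reply in the JSON shape above can be filtered by label in a few lines. This is a minimal sketch: `SAMPLE_CLASSIFICATION_REPLY` is a hand-written stand-in whose labels are placeholders rather than actual GPT-3 output, and `sentences_with_label` is a hypothetical helper name.

```python
import json

# Hand-written stand-in for a reply in the JSON format above; labels are placeholders.
SAMPLE_CLASSIFICATION_REPLY = """
{
  "response": [
    {"sentence_text": "We will stay the course until the job is done.",
     "classification": "hawkish"},
    {"sentence_text": "Good afternoon.",
     "classification": "neutral"}
  ]
}
"""


def sentences_with_label(reply: str, label: str) -> list:
    """Return the sentences whose classification matches the label (case-insensitive)."""
    items = json.loads(reply)["response"]
    return [
        item["sentence_text"]
        for item in items
        if item["classification"].strip().lower() == label.lower()
    ]


if __name__ == "__main__":
    for sentence in sentences_with_label(SAMPLE_CLASSIFICATION_REPLY, "hawkish"):
        print(sentence)
```

Comparing labels case-insensitively guards against replies that capitalize the label differently from the prompt.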

Pretty nice! We have only scratched the surface of what is possible with OpenAI and GPT-3, but hopefully you can begin to imagine the possibilities here. We are starting with small building blocks, but understanding each of these simple concepts will allow us to compose more complex applications and perform more sophisticated tasks.