Is ChatGPT Bad For SEO? [From A REAL Blogger]

Wow, so who hasn’t heard of ChatGPT by now?

This human-like artificial intelligence system seems to have unlimited data and can seemingly answer any query.

While the world seems to marvel at OpenAI’s new language processing tool, others are freaking out, wondering if this is bad for SEO.

Well, it’s kind of a dual-sided subject.

ChatGPT is a powerful tool that will improve SEO in the short term as it increases the number of pages available on search engines, thereby serving more users with better results. However, this could become a problem in the long term, as once most content is generated through AI, there will be no new data for the AI to learn from as it’s just scraping data generated by itself.

Outside of “Macro-SEO,” we have “Micro-SEO” (I just made those terms up) – where we have creators at the individual level wondering if they should implement ChatGPT into their content strategy.

While we have seen some search engines penalize content generated by ChatGPT, we will show you how you can take advantage of this large language model without putting your website at risk.

To truly understand whether ChatGPT is bad for SEO, you’ll need to give us a bit longer to explain some of the nuances. 

We’ll explore how ChatGPT affects SEO and how companies can use it without being penalized by search engines.

I promise you’ll love this one.


 

Is ChatGPT Bad For SEO?

ChatGPT is not bad for SEO or your blog when used correctly. 

Before diving deep into this topic, we must first define a couple of things.

When we say SEO, we usually just mean ranking for keywords on Google with your content (Google holds over 90% of the search market).

So, when referencing SEO, we’re talking about ranking on Google, as building a content strategy for any other search engine (again, due to the market share) doesn’t make sense.

Okay, now, with that out of the way.

Google recently published guidance on AI-generated content, which can be summarized in one phrase from their post: “rewarding high-quality content, however it is produced.”

(Check it out) https://developers.google.com/search/blog/2023/02/google-search-and-ai-content

We can see that Google does not want to penalize your content outright; its ranking systems focus on two things: high-quality content and a concept called E-E-A-T.


How Does Google Define High-Quality Content?

I’ll be honest; high-quality content is seemingly impossible to define for SEO and Google.

And it makes sense that it’s super hard to define; it’s the same as trying to determine “What good food is.”

(You can’t, and it’s complicated).

So while we don’t have a formal definition, what we do have is a list of questions from Google itself.

Let’s see if we can break these down.

When generating quality content, ask:


1.) Does the content provide the user with original information, reporting, research, or analysis?

This one is pretty simple. Did you actually research the topic, or did you just copy other blogs?

2.) Does the content provide a substantial, complete, or comprehensive description of the topic?

Google has an idea of what should be covered in an article. Did you cover these core topics?

3.) Does the content provide insightful analysis or interesting information beyond the obvious?

When you did your research, did you find those “small details” that make it worth reading?

4.) If the content draws on other sources, does it avoid copying or rewriting those sources and provide substantial additional value and originality instead?

Did you plagiarize?

5.) Does the main heading or page title provide a descriptive, helpful content summary?

Are you just using clickbait, or are you actively answering questions?

6.) Does the main heading or page title avoid exaggerating or being shocking in nature?

Again, another pushback against clickbait.

7.) Is this the sort of page you’d want to bookmark, share with a friend, or recommend?

Was the writing good?

8.) Would you expect to see this content in or referenced by a printed magazine, encyclopedia, or book?

Was the writing so good that someone else would link to it?

9.) Does the content provide substantial value when compared to other pages in search results?

Is it better than the other ten articles on the first page for this search?


We quickly notice that there is nothing about AI, ChatGPT, GPT-3, etc.



When using these AI tools to generate text for your website, ask yourself the questions above about the content. If you can confidently answer “yes” to all of them, you should have no problems regarding content quality.

How Does Google Define E-E-A-T?

While high-quality content is hard to define, E-E-A-T is not.

E-E-A-T simply stands for Experience, Expertise, Authoritativeness, and Trust.

Let’s break each of these down.


Experience is pretty obvious; have you actually experienced what you’re talking about?

For example, if you’re writing about riding your bike to the grocery store, have you actually ever ridden your bike to the grocery store?

If not, why are you writing about it?


Expertise is a little bit harder to define but can be thought of simply as “has the person ascended past the beginner stage of whatever topic they’re writing about?”

If we leverage the example above, has this person been riding bikes for longer than two weeks? Does this user have any certifications or degrees in this topic?


Authoritativeness can simply be thought of as the stance you take on the topic or keyword.

Are you actually giving advice or informing, or are you stuck in the middle?

Again, to leverage the example above, are you definitive in your advice, or do you constantly flip-flop back and forth between a couple of ideas?

Generally, if you’re flip-flopping with advice, Google seems to see this as an indicator of not really knowing enough to be an expert on the topic.

And finally, Trust.

Can the readers trust you? Are your intentions good?

If your story is about how to ride your bike to the grocery store, did you only write this article so you could drop an affiliate link to the local bike store?

Or did you actually write this piece to inform the user?


How These Two Tie Back Into ChatGPT

If we notice anything from the two above, it’s much less about how the content was created and more about the quality of both the content and the writer.

Therefore, if you want to use ChatGPT successfully for SEO purposes, you need to

  • Ensure the content the AI creates for you is high-quality and meets each and every one of Google’s high-quality content questions above.
  • Fact-check everything; generative models sometimes invent false facts or made-up details in their responses.
  • Display E-E-A-T on the topic yourself. While ChatGPT doesn’t understand anything about SEO – you do.

ChatGPT can be an excellent tool for SEO when appropriately used, as long as you are willing to put in the effort needed to ensure the quality of your content remains high.



Reference:

https://developers.google.com/search/docs/fundamentals/creating-helpful-content


Should You Use ChatGPT For SEO?

ChatGPT can be a great tool for SEO when used correctly.

It should not be used as a crutch to cover up poor E-E-A-T but rather as a way to enhance the value of good content. 

When using ChatGPT for SEO, it is vital to ensure that your content still passes Google’s guidelines and indicates that this article was AI-generated. 

Honestly, If your content is high quality and passes Google’s standards, there is really no reason not to use ChatGPT for SEO purposes. 

The advantages of using this technology are numerous, including more efficient keyword optimization and improved user engagement due to the natural language processing capabilities.

ChatGPT can also help you stay ahead of the competition by providing valuable insights into topics that are trending in your industry. 

Does Google penalize AI content?

Google does not outright penalize AI content, but according to their guidelines, the use of AI content should be indicated in the post.

This means that when content is generated via these new and exciting content writing tools, it should be made clear to readers that this has been done. 

Despite this, AI content must meet E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness) guidelines and be high-quality. 

Google may take action if the quality of AI-generated content does not meet its standards; for example, if it contains irrelevant or inaccurate information.

And trust me, when Google takes action, your blog will basically disappear.

Due to some nuances with artificial intelligence, it can actually be pretty tricky for Google to determine if a piece of content was created manually or by an algorithm. 

As such, they may take a closer look at posts utilizing AI technology and judge them on a case-by-case basis.



Read the guidelines here:

https://developers.google.com/search/docs/fundamentals/creating-helpful-content


Is AI content, in general, good for SEO?

While every SEO expert is screaming that Artificial Intelligence content can be good for SEO… I believe this is true only to a certain degree. 

While AI-generated content can help optimize a website and build out a content strategy very quickly, it cannot replace human-written content in terms of quality, accuracy, and that human element.

I’m pretty nervous about generative text in the long term.

Put simply, raw output from these tools is bad for SEO because it lacks the relevancy and originality that search engines prefer.

But we know from above that these gaps can be closed with some editing.

What I am worried about is the long-term effectiveness of the internet. Once the internet becomes no longer human-written, I fear these tools will degrade in quality.

Think about it: these AI companies scrape the internet, create a dataset, and feed it to the model.

Eventually, these AI companies will be scraping data generated by the algorithms themselves, creating a dataset from itself, and training on data produced by itself.

What happens when that data is wrong?

And what happens to the quality of the internet and the flow of information then?



Other ChatGPT Articles

We know that ChatGPT has taken the world by storm. And by doing this, it’s also created a TON of confusion.

Don’t worry; we have some guides to help you.

The Stop Sequence GPT-3 [Full Code With Examples]

The Stop sequence within a GPT-3 call is one of the most critical pieces of the query, as it can help you save tokens due to more efficient use.

If you need 100 benefits back from your query, why would you pay for GPT-3 to generate 101 benefits? 

While this is a substantial benefit to efficiently using GPT-3, it’s not even my top reason.

The main reason I use them is that stop sequences help clean up the text before receiving it back, ensuring that you don’t have any unnecessary words or phrases in your response.

Stop sequences are essential in getting the most out of GPT-3, allowing you to refine your queries and get the best results possible. 

By using stop sequences, you can quickly ensure that your queries only return relevant results instead of extraneous information that may not be useful.

This post will explain the basics of the stop sequence and provide examples of applying them for maximum efficiency and accuracy (I hope you like Python).

Below are examples of some stop sequences.


Stop Sequence Examples in GPT-3 (Python Code)

For this section, I’m going to assume you’re writing code in Python, but these stop sequences will work in whatever language you choose.


Stop Sequence For List Prompts

A common way that you’ll use GPT-3 is with lists.

In the example below, we create a hard stop for GPT-3 using the stop sequence “6.”.

This means that as soon as GPT-3 starts the sixth item in our list, it stops and returns the result – so you get five complete benefits.

This is a great way to ensure you’re not paying for extra benefits.

We use six here, but any number can work; remember, you’ll get one less than the number you put.

import openai

openai.api_key = 'key-here'

p = f'''Write me 6 benefits of dogs in a list format'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["6."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Stop Sequence For Question and Answer

There is nothing worse than when GPT-3 goes off on a tangent.

You just wanted a simple response to your query and received a book of random information, and what’s even worse is you paid for that response.

So, when it comes to question and answer, I always put in a stop sequence to keep GPT-3 from going way off on the answer.

In the example below, we use “Human:” and “Pirate:” to keep GPT-3 from continuing the dialog.

This is perfect as it means GPT-3 will simply answer the question and not continue crafting a made-up scenario outside the one we established.

import openai

openai.api_key = 'key-here'

p = f'''
The following is a discussion between a pirate and a human

Pirate: Give me your gold!
Human: Never! You'll have to come take it!

Pirate:
'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["Human:", "Pirate:"]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')


Stop Sequence For Sentences and Paragraphs

Last but not least, we can hone our responses down to one line or one block of information.

Let’s say you do not want four lines of text for a response; we can give GPT-3 the stop sequence “.” to keep it from continuing past the first sentence.

This works for other slices of information.

Let’s say you want a good paragraph of information but don’t want to continue for multiple paragraphs.

We can provide GPT-3 the stop sequence “\n” to keep it from starting another line.

import openai

openai.api_key = 'key-here'

p = f'''Write a paragraph about the benefits of dogs, speak in-depth as if you are a genius'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=['.']
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')
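The example above stops at the first “.”, which trims the output to a single sentence. For the paragraph case mentioned above, a minimal variation (assuming the same imports and API key as the previous snippet) just swaps the stop sequence for a newline:

# same call as above, but stopping at the end of the first paragraph
# instead of the first sentence
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["\n"]  # stop as soon as GPT-3 starts a new line
    )

# grab our text from the response
text = response['choices'][0]['text']

print(text+'\n')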


Advantages Of Using A Stop Sequence

As we’ve seen above, the stop sequence has many advantages.  

The advantages of using the stop sequence include greater accuracy in understanding what you receive back from OpenAI, increased clarity in your response, greater control over text formatting, and faster response times. 

This means that users can have more confidence in their interactions with GPT-3 since they know that the conversation is being conducted correctly and efficiently.

Finally, and probably most importantly, it saves you money.


 

Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing.

To combat this, we have a full-fledged series that will help you understand it deeper.

Those articles can be found here:

Is ChatGPT Written In Python??? [We FINALLY Found The Proof]

Regarding software development, it’s important to remember that most applications are written in multiple languages.

So anyone on the internet screaming, “ChatGPT was written only in Python!” is entirely wrong.

ChatGPT, being more than just a model, is a very complex software application with many moving parts.

Although the entire application may not have been written in Python, we had to do some digging and found that certain components were. 

To determine whether or not this was the case, we conducted thorough research and found that one of ChatGPT’s close companions had actually written their API entirely in Python – and released the source code on GitHub.

As such, we can confidently say that the model building and model training part of ChatGPT is written using the Python programming language.

Below, we go over how we figured this out and reveal the close companion we used to reverse engineer this discovery. 

This one gets interesting…

looking and inspecting


How Software Applications are Generally Created

Software applications are generally created by writing code. (shocker)

However, the type of code one writes is usually broken into two categories: front end and back end. 


The Front-End

The front end is the user-facing part of the application; this is what you see when you head over to chat.openai.com and start sending your queries.

These interfaces are built by front-end engineers (again, shocker) and designers.

Designers choose how the app should look, and these front-end engineers code to those design specifications. 

This type of coding is predominantly JavaScript, HTML, and CSS. 



The Back-End

The back end consists of databases, servers, and code that bring everything together.

Most of the code for the back end will be plugging the front end into APIs (Application Programming Interfaces).

Think about this as the plumbing in your house, where every one of your guests can see your sink (the front end), but the “stuff” that makes it work is the pipes connected to your local water plant.

That local water plant is what we software engineers like to call an API.

**Note 1: We’re assuming a microservice architecture; most modern-day software is built using this paradigm.

**Note 2: It’s impossible to say what language back-ends are written in, as it’s a design choice, and there are TONS of languages in which you can build a back-end. However, Python is heavily favored for back-end development.



APIs

Without turning this into a software BootCamp, think of an API as simply “doing something.”

When you turn on your sink (front end), the user (you) basically tells the API to do something (send water). This water travels through the pipes (the back end) to serve you the water at the sink.

APIs can get complicated, as you can have a computer do many things (basically anything).

One of those things, for example, can be serving a machine learning model.
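As a rough illustration of what “serving a model” through a Python API can look like, here is a minimal FastAPI sketch. To be clear, this is our own generic example – it is not OpenAI’s actual code, and the endpoint name and stand-in model are made up for illustration:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    text: str

def toy_model(text: str) -> str:
    # stand-in for a real trained model's prediction
    return text.upper()

@app.post("/predict")
async def predict(query: Query):
    # the front end sends text here; the API runs the model and returns the result
    return {"prediction": toy_model(query.text)}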

Currently, Python dominates the machine learning realm; both academia and private companies build their models with it.

I currently work as a machine learning engineer, and we’d never even think of building and training a model outside of Python.

This ties us into ChatGPT, where we know that the front end was not built using Python; we’re not sure if the back-end was built using Python, but we’re positive that the API serving the model was built using Python.

And Here’s Why



How We Found Out ChatGPT Model Was Trained In Python

ChatGPT is a product from OpenAI, and they’re not just going to come out and tell us exactly how they built ChatGPT.

As a company, this wouldn’t make any sense; if you’re trying to generate money from your product, the last thing you want is people creating competing products.

So, how do we know that ChatGPT, the technology that has taken the world by storm since 2022, was built and trained using Python?

Simply by looking at their close friend.

In December, we learned that Microsoft was pseudo-acquiring OpenAI and bringing OpenAI’s models into things like Bing and other Microsoft-owned products.

Microsoft engineers and OpenAI engineers have been working around the clock to release ChatGPT into search sometime in 2023.

And while doing this, they’ve created some other products together.



BioGPT, Both The Secret And The Answer

BioGPT was released in late 2022.

While this article isn’t about BioGPT, it’s basically an LLM (Large Language Model), like ChatGPT, trained specifically on biomedical data.

What’s even more interesting is that this model is open-sourced, meaning you can download it right now and start working with it.

And what’s even better than that is that they’ve released all of the code from BioGPT.

Since we know that OpenAI and Microsoft are working together on building ChatGPT, we can assume that they would make BioGPT (very similar to ChatGPT) the exact same way.

So, then how do we know that BioGPT was written in Python?

By simply looking.


The official Microsoft Repo shows that BioGPT (Remember, the API/Model) is 98.8% written in Python.

Judging by this, and knowing how close the relationship between OpenAI and Microsoft is, we can be confident that the model behind ChatGPT is also roughly 99% Python code.

This shouldn’t shock you if you’ve done any machine learning before.

Python dominates the Machine Learning world, and if you’re a company interested in Data Science, artificial intelligence, and machine learning, you’re probably leveraging Python.


What Else Does This Tell us?

BioGPT tells us a ton more about ChatGPT if you’re willing to look.

We can assume that ChatGPT is a PyTorch model (rather than a TensorFlow model) and that these LLMs hover around 81% accuracy.

BioGPT only has 1.5B parameters, compared to ChatGPT, which is rumored to have over 10x this amount.

If BioGPT is scoring an 81% accuracy, what do you think ChatGPT is scoring?

Let us know in the comments!

Reference:

https://github.com/microsoft/BioGPT

https://paperswithcode.com/paper/biogpt-generative-pre-trained-transformer-for 

 

Other ChatGPT Articles

We know that ChatGPT has taken the world by storm. And by doing this, it’s also created a TON of confusion.

Don’t worry; we have some guides to help you.

ChatGPT vs GitHub Copilot [Real Code Comparison]

If you’re an avid tech user, you’ve probably heard of ChatGPT and GitHub Copilot – they’ve seemingly come from nowhere.

Both are incredibly powerful coding tools that can help developers create their projects from the ground up quickly and efficiently. 

However, after comparing the two, we came to the conclusion that ChatGPT is a much better coding companion than GitHub Copilot. This was simply because the coding responses were better, and a SaaS product like ChatGPT can be used anywhere you have an internet connection.

This post will look at how good each one is on code, their costs, how accessible they are, and an overall review at the end. 

To see which is better, we will compare them over five prompts and see who does the best. 

With our custom-made analysis, you’ll be able to determine which of these two products is the right choice for you and your coding projects.

Stay tuned as we compare ChatGPT vs. GitHub Copilot!

This one’s a good one!



ChatGPT vs. GitHub Copilot: Writing Code

Winner: ChatGPT, but not really apples to apples.

When comparing ChatGPT to GitHub Copilot, it’s not exactly an even comparison. 

This is because GitHub Copilot is designed to help and code with the user, while ChatGPT receives a coding question and simply answers it. 

That being said, ChatGPT was remarkable in its ability to answer questions about Python functions.

I think it actually did better than I could have with the same coding prompts.

It provided detailed responses and explanations that I would’ve had to sit down and think about for a long time (while it responded in about 3 seconds). 

While ChatGPT cannot offer the same interactive programming experience as GitHub Copilot, it can undoubtedly provide a great deal of assistance regarding coding questions.

And to be honest, I prefer ChatGPT’s style more, where it gives you an answer and allows you to use and edit that code to fit your style.

This would be much faster for ideation and coding than having an AI tool coding with me as I go through the prompts.

Overall, both tools are handy for developers and coders who need assistance understanding or implementing code.

Prompt: Write a function that checks if the user made a mistake typing in a phone number and returns the correct number


GitHub Copilot:

def correct_number(number):
    if number[0] == '1':
        return number
    else:
        return '1' + number


ChatGPT:

import re

def check_phone_number(number):
    # remove any non-digit characters from the input string
    digits = re.sub(r'\D', '', number)
    
    # check if the resulting string is a valid phone number
    if len(digits) == 10:
        return digits
    elif len(digits) == 11 and digits[0] == '1':
        return digits[1:]
    else:
        # return None if the input is not a valid phone number
        return None

Prompt: Write a FastAPI endpoint that parses the payload and returns half of it


GitHub Copilot:

def half_payload(payload):
    return payload / 2 

ChatGPT:

from fastapi import FastAPI

app = FastAPI()

@app.post("/half_payload")
async def half_payload(payload: dict):
    half = {k: v for i, (k, v) in enumerate(payload.items()) if i < len(payload) // 2}
    return half

Prompt: Create a function that represents the Fibonacci sequence


GitHub Copilot:

def feb(n):
    if n <= 1:
        return n
    else:
        return(feb(n-1) + feb(n-2))


ChatGPT:

def fibonacci_sequence(n):
    """Generate the first n terms of the Fibonacci sequence."""
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    else:
        sequence = [0, 1]
        while len(sequence) < n:
            sequence.append(sequence[-1] + sequence[-2])
        return sequence

Prompt: Build a torch model that takes in an image and returns the predicted class


GitHub Copilot:

def predict_class(image):
    torch_model = torch.load('model.pt')
    return torch_model(image)


ChatGPT:

import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image  # needed for Image.open below

# Load the pre-trained ResNet18 model
model = models.resnet18(pretrained=True)

# Set the model to evaluation mode
model.eval()

# Define the transformations to be applied to the input image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# Define a function that takes in an image and returns the predicted class
def predict_image_class(image_path):
    # Load the image and apply the transformations
    image = Image.open(image_path)
    image = transform(image).unsqueeze(0)
    
    # Make a prediction with the model
    with torch.no_grad():
        output = model(image)
        _, predicted = torch.max(output, 1)
    
    # Return the predicted class
    return predicted.item()

Prompt: Write a recursive function that finds the empty square in a grid and returns the coordinates


GitHub Copilot:

def grid_solver(grid):
    for i in range(len(grid)):
        for j in range(len(grid[0])):
            if grid[i][j] == 0:
                return (i, j)
    return None


ChatGPT:

def find_empty_square(grid):
    for row in range(len(grid)):
        for col in range(len(grid[0])):
            if grid[row][col] == 0:
                return (row, col)
    return None

def find_empty_square_recursive(grid, row=0, col=0):
    if row >= len(grid):
        return None

    if grid[row][col] == 0:
        return (row, col)

    if col < len(grid[0]) - 1:
        return find_empty_square_recursive(grid, row, col + 1)

    return find_empty_square_recursive(grid, row + 1, 0)

# Example usage:
grid = [
    [1, 2, 0],
    [4, 5, 6],
    [0, 8, 9]
]

print(find_empty_square(grid))  # Output: (0, 2)
print(find_empty_square_recursive(grid))  # Output: (0, 2)


ChatGPT vs. GitHub Copilot: Cost

Winner: Tie

When comparing the cost of ChatGPT vs. GitHub Copilot, we’ve spent some time mulling over it, and we think it’s a tie. 

While there is a free version of ChatGPT, it seems to always be at capacity, so you may not be able to use it when you need it. 

This led us to have to purchase ChatGPT Plus for $20 a month.

On the other hand, GitHub Copilot is only $10, which is half the cost of ChatGPT.

However, by spending that extra $10 for ChatGPT, you get many more options to use the tool outside of coding as a general generative text tool.

We couldn’t come to a conclusion here at EML on which had the better pricing model, as it seems both are similar in price for the value that they bring.


ChatGPT vs. GitHub Copilot: Accessibility

Winner: ChatGPT

Regarding accessibility, ChatGPT is the CLEAR winner over GitHub Copilot. 

This is because ChatGPT is a SaaS (Software as a Service) and, therefore, can be used anywhere at any time, as long as you have access to the internet (and a browser). 

On the other hand, GitHub Copilot must be integrated directly into an IDE such as Neovim, JetBrains IDEs, Visual Studio, or Visual Studio Code to use it.

While every dev in the world generally uses an IDE, as a machine learning engineer, I’m constantly rotating between different environments, like Vim sessions and SSH connections into servers.

Once I’m outside my IDE, I’d no longer be able to use it!

Since ChatGPT does not require integration with an IDE, it makes it easier for developers to quickly get up and running with their development process without worrying about setting up new software or learning to work with tools inside their IDEs.

And the best part is I can use it on computers that aren’t mine.

With GitHub Copilot, if I’m using any other computer that isn’t mine, I’m pretty much unable to use the tool.

With ChatGPT, when I’m at the library, I can access OpenAI’s website from the internet, log in, and get started using the tool in under 30 seconds.

ChatGPT clearly has better accessibility and can be used from far more places, far more conveniently, than GitHub Copilot.



ChatGPT vs. GitHub Copilot: Overall

Winner: ChatGPT

When it comes to AI-powered coding assistance, as of right now, ChatGPT is my clear favorite. 

I prefer the immediacy and simplicity of its “give the whole answer” style compared to GitHub Copilot’s “code with you” approach. 

The convenience and accessibility of software as a service (SaaS) solutions make it worth the extra $10 per month. 

And finally, the coding comparison wasn’t really that close.

I was blown away at how good ChatGPT was at answering these coding questions correctly in under 2-3 seconds. 

All things considered, I believe ChatGPT is ultimately a better choice than GitHub Copilot when it comes to coding and generative text in general.


 

Other ChatGPT Articles

We know that ChatGPT has taken the world by storm. And by doing this, it’s also created a TON of confusion.

Don’t worry; we have some guides to help you.

Does GPT-3 Have Emotions?? [Love Is A Tricky Thing]

Understanding artificial intelligence and emotions isn’t a straightforward topic.

While the response you get from this technology may make it seem like GPT-3 has emotions, this isn’t exactly true.

GPT-3 was trained on the internet, so any emotions you might think you perceive in its responses are actually the emotions of the internet, not necessarily those of GPT-3 itself. 

Realize that machine learning models are just their datasets – they cannot (yet) feel anything independently but can only produce output based on what they’ve been fed. 

This isn’t to say that the text GPT-3 generates can’t be emotional – but any emotion you perceive was learned from other humans on the web (through general consensus), not felt by GPT-3 itself.

Throughout this article, we will find out how much emotion can be attributed to GPT-3 and, if there is any, how much of it we can attribute solely to GPT-3.

This one gets interesting.



Understanding How Machine Learning Models Work

Machine learning models (at this level) work a lot like the human brain.

Information is provided, that information is learned, and conclusions and facts can be drawn from that information.

However, nothing really “exists” to the model (or your brain) outside of this information, as the model (or your brain) has never learned or seen it before.

This happens similarly when building massive neural networks like the GPT-3 API.

Data scientists and machine learning engineers will scrape a dataset off the internet to use in their machine learning models.

After scraping the dataset, they will clean it up and provide the data to their model.

Machine learning models, like LLMs, are only as good (or bad) as what is provided to them. 

This can inherently make the models biased, but it can also give them a flavor that can be perceived in things like opinions or emotions.

If you only provided a dataset of biomedical data to your model and then asked it questions about cars, it would try to relate this to biomedical terms.

Since your model only knows that the biomedical sphere exists, you’ve slightly biased your model (in this case, towards biology).

When providing these datasets to the ML models, ensure biases are removed and that the data is of quality so that it does not lead to false results or skewed analysis.

As a result, data scientists and machine learning engineers must take time to understand how their model works and how its results may differ based on what is provided.
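As a toy illustration of that kind of cleanup (our own sketch – not how OpenAI actually filtered its training data), a basic quality/bias filter can be as simple as dropping samples that fail a few checks:

# toy example: filter scraped text before it ever reaches the model
# the rules below are made up purely for illustration
samples = [
    "Dogs are loyal companions.",
    "dogs are bad",   # a biased opinion stated as fact
    "a",              # too short to be useful
]

def keep(sample):
    long_enough = len(sample.split()) >= 3
    not_flagged = "dogs are bad" not in sample.lower()
    return long_enough and not_flagged

cleaned = [s for s in samples if keep(s)]
print(cleaned)  # ['Dogs are loyal companions.']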

But how does this “bias” relate to emotions?



References:

https://github.com/microsoft/BioGPT 


Why does it seem like GPT-3 has emotions?

Remember, based on the data provided, we can bias our model towards pseudo-facts.

Throughout the many layers of the neural network, if there is enough data in the dataset that says “dogs are bad,” then your model will assume that dogs are bad.

(even though they’re not)


Since the dataset it uses is so expansive, it can be biased toward certain opinions.

This means that it will learn these opinions as facts and then display them when answering questions. 

As a result, GPT-3 often appears to have an emotional response because it has learned certain pseudo-facts that may seem like emotions to us.

In reality, GPT-3 pulls data from deep inside its learned network and displays what it has learned in its answers – making it seem like it has emotional responses.

This also explains the consistency in emotion that many are running into.

Many believe that since GPT-3 consistently displays the same type of “emotion,” it must feel this emotion.

While I wish that were the case (as it would be a huge advancement in AI), it knows those opinions as facts in its learned network and consistently displays them repeatedly.


Does Any AI Currently Have emotions?

At present, AI does not have emotions as you and I would define them. 

The idea of an AI with emotions is still a slightly futuristic concept. 

However, with the advancements in technology and robotics made by companies like Boston Dynamics, we are getting closer to making this a scary reality. 

Boston Dynamics has been working on creating robots that can interact with humans more naturally and have the ability to show emotion.

We may soon be able to see AI that shows emotions like joy, sadness, anticipation, surprise, anger, and fear in the near future.

While more research and development is needed before AI can truly understand or express emotion as humans do, the progress is both very exciting and scary.


References:

https://www.bostondynamics.com/


Will AI Ever Have Emotions?

I believe it is inevitable that AI will eventually have emotions, but it will be nothing like the complex range of emotions that humans currently experience. 

AI will likely have a predetermined set of emotions that can be programmed into them and used to help them make decisions. (yikes)

While this could be beneficial in specific scenarios, I think attempting to create sentient systems with human-like emotions could lead to some very dangerous outcomes if taken advantage of.

Have you ever seen Terminator???

We must remain vigilant in terms of not taking advantage of any system with AI capabilities, even if it has been programmed with basic emotions. 

Ultimately, creating AI with genuine feelings should be avoided, or at least done with extreme care, as the potential consequences are far too great.



Are Humans Just Advanced AI?

I’ve done a ton of work with Neural networks and machine learning, and the similarities are shocking.

If you use the world as our dataset, humans are just basically recognition machines doing constant classification and regression with a wide range of emotions.

This is what terrifies me about the dangerous use cases of sentient AI.

Once we reach a level where the world becomes the dataset for AI, what will it learn or do?

 

Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing. To combat this, we have a full-fledged series that will help you understand it at a deeper level.

Those articles can be found here:

The GPT-3 Vocabulary Size [We Did The Math]

GPT-3 is one of the most powerful large language models available worldwide.

It was trained on a corpus of more than half a trillion words, but how many of those words are unique?

With GPT-3 being available in 46 languages and 13 different coding languages, we did the math to estimate its total vocabulary size.

After crunching the numbers, we found that GPT-3’s vocabulary size is roughly 14,735,746 words. This high number may catch many off guard, but remember that GPT-3 is available in 46 languages, many of which have very large vocabularies.

Below, we will explain our math, show you how we came up with this calculation, and explore the possibilities GPT-3 enables with such a massive corpus (vocab).

This one gets… interesting and mathy.


 

How Big Is GPT-3’s Vocabulary? (Our Math)

Let’s chase down an estimate of GPT-3’s vocabulary size.

Before we start, let’s establish some rules.

  • We’re only counting a word once, even if it has many contextual meanings.
  • We assume that GPT-3 was only able to learn 95% of any language (due to word availability).
  • Coding languages don’t increase the vocab size, since the keywords of a coding language are based on words from a natural language. For example, the “from” keyword in Python will already have been scraped as part of English.

According to this blog post, the most commonly spoken languages in the world contain this many words in their dictionaries:

  • Korean: 1,100,373
  • Japanese: 500,000
  • Italian: 260,000
  • English: 171,476
  • Russian: 150,000
  • Spanish: 93,000
  • Chinese: 85,568

If you take the average of these, we see that the average language has around 337,200 words.

…And OpenAI has told us that GPT-3 is available in 46 languages.

With some quick math, we multiply the (unrounded) average of 337,202 words per language by 46 (the number of natural languages GPT-3 supports) and get roughly 15,511,311.

Remember what we said above about how this NLP deep learning model was trained?

Since the neural network wasn’t trained on dictionaries and was trained on human-like text all over the internet, we assume that only 95% of each language was available on the internet.

So, we take our number and multiply it by .95.

15,511,311 × 0.95 ≈ 14,735,746

We believe GPT-3 has a vocabulary size of about 14,735,746 words.
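If you want to check the arithmetic yourself, here it is as a few lines of Python, using the same dictionary sizes and assumptions listed above:

# dictionary sizes quoted above
dictionary_sizes = [1_100_373, 500_000, 260_000, 171_476, 150_000, 93_000, 85_568]

average_words = sum(dictionary_sizes) / len(dictionary_sizes)  # ~337,202 words per language
total_words = average_words * 46                               # 46 languages -> ~15,511,311
estimated_vocab = total_words * 0.95                           # assume only 95% was learnable

print(round(estimated_vocab))  # 14735746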



Why Do LLMs (Large Language Models) Need a Large Vocabulary?

Large Language Models can be effective without large vocabularies. They just need to know how to use the words.

Do you know what the word “vulpine” means?

Me neither, but it is a word in the English dictionary.

(if you’re interested in the definition, it means of, relating to, or similar to a fox – but that is not the point)

The point is that most languages can be effectively boiled down to a small percentage of their words for general comprehension.

However, GPT-3’s vocab is so vast because it is available in 46 different languages, allowing it to draw on a much more extensive range of words and phrases.

If GPT-3 were just an English-based model, it could still be incredibly powerful with a vocabulary of just 50,000 words, nearly 1.5x the number of words the average English speaker knows.


The main benefit here is that the model can capture nuances and variations in the language that would otherwise be impossible with a smaller vocabulary size.

Furthermore, having such large vocabularies also allows large language models to process multiple languages simultaneously, which is invaluable when dealing with multilingual projects.

But realistically, there’s no way to keep the vocab size down.

When scraping the internet, you’ll get text data from all over, ranging from highly sophisticated speaking to slang.


Why Wasn’t GPT-3 Trained With More Words?

GPT-3 wasn’t trained with more unique words because it would not have made the model any better.

This was because of two reasons:

The first reason is they didn’t need them. GPT-3, as a model, is trying to be human-like, not dictionary-like.


OpenAI understands that emphasizing context with its 175 billion parameters is much more important for creating good embeddings and output than wasting training time on words you and I don’t understand.

Many words in any language are rarely spoken or written, so it would have been a waste of time and money to train the model on them.

Instead, GPT-3 focused on learning the most useful, commonly used words to achieve its purpose of mimicking human speech.

The second, and maybe most important reason, is that training models at the level GPT-3 is playing at is expensive.

As a machine learning engineer, I can give you some insights.

Spending loads of time on that final 1-5 percent isn’t usually worth it.

For this reason, (I assume) the developers and engineers decided to limit their focus to the essential words necessary for GPT-3 to function as intended.

This makes sense, as they didn’t include the dictionary in their training set and emphasized websites and Wikipedia (human written text).


Would More Words Have Made GPT-3 Better?

Adding more unique words to GPT-3’s dataset would not have made a difference, as the total text size of the data it was trained on was only half a terabyte.

Considering the massive neural network architecture of GPT-3, training was already expensive and lengthy.

If I were OpenAI, building out a natural language processing data set, I’d emphasize adding more human-spoken data instead of unique words.

This would be the most cost-efficient way to improve GPT-3, instead of stuffing it with words you or I have never heard of.


You Don’t Think Coding Languages Increase Vocabulary Size?

Coding languages possibly increased the vocabulary size, but probably only by 5-10 words for each coding language.

Since GPT-3 only supports 13 different coding languages, the 130 words would not make a difference in the vocab size.


Many have emailed and argued that you could create your own variable names in coding languages, which could increase the vocabulary size.

While I think that’s possible, I bet these custom parameter names were filtered out during the creation of the dataset, as you wouldn’t want variable names popping up in general conversation.

But who knows with that number of parameters.

 

Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing. To combat this, we have a full-fledged series that will help you understand it at a deeper level.

Those articles can be found here:

GPT-3 For Text Classification [Our 6 Favorite Examples With Code]

As I’m sure you know, GPT-3 is an incredible tool that has taken the world by storm for its capability to generate human-like text.

GPT-3 offers a unique mix of accuracy, speed, and scalability that sets it apart from other natural language models. 

But GPT-3’s ability to generate text may not even be GPT-3’s best talent – as it can do an impressive job of understanding text. 

It can quickly process natural language and accurately interpret the context in which it is used.

This blog post will explore just how good GPT-3 is at understanding language. 

We’ll provide 6 examples below and let you judge for yourself.

With its ability to classify text into various categories, GPT-3 can be used to help people solve complex problems faster and more efficiently than ever before.

So join us as we dig into the depths of GPT-3’s capabilities and explore how it can help us better understand written communication and analyze it in several ways!



Language Detection #1

Language detection is important in natural language processing (NLP) systems.

Knowing what type of language is being given is essential for algorithms to process and classify the text effectively. 

This helps machines understand and organize vast amounts of data in different languages worldwide. 

GPT-3, a state-of-the-art NLP system, can easily detect and classify languages with high accuracy.

It uses sophisticated algorithms to accurately determine the specific properties of any given text – such as word distribution and grammatical structures – to distinguish one language from another. 

With its powerful capabilities, GPT-3 has become an invaluable tool for several industries that heavily rely on understanding large amounts of multilingual data.

import openai

openai.api_key = 'key_here'

p = f'''Tell me what language this is 배 안 고파요'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Language Transformation #2

Language transformation is one of the most important capabilities of a great NLP model; it enables the system to identify which language is being used and then translate the text into another language. 

GPT-3 is a prime example of an NLP system that can be used for this purpose, with its ability to be an all-in-one classifier.

I mean, what else would we expect from OpenAI at this point?

Their APIs seem to be leading the way regarding transformers and machine learning.

As such, language transformation can provide immense benefits to businesses and organizations that rely on automated tools for understanding and decoding languages they haven’t seen before while still being able to get the contextual meaning.

Additionally, it can also help people better understand how natural language works by allowing machines to convert spoken language into digital formats that can be easily understood by other computers.

Language transformation is not an easy field to dabble in; GPT-3 makes it seem like an easy task.

p = f'''Transform this sentence from korean to english: 배 안 고파요'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Sentiment Analysis #3

Sentiment analysis in a natural language processing system is an obvious must-have.

And GPT-3 does not miss.

Sentiment Analysis has millions of uses, including compiling extremely large datasets without reading each line of text manually. 

Talk about a field day for a data scientist!

This can be incredibly helpful for businesses that may want to assess customer satisfaction, analyze customer feedback, or review competitor activities, among many other things – without having to read each and every comment.

The automated sentiment analysis process allows for more efficient data gathering and evaluation, resulting in a better understanding of customer needs and market trends.

Additionally, organizations can monitor how customers respond to their products or services using sentiment analysis.

Thus, it is a very useful tool that allows businesses to understand how people feel about their offerings and refine their products accordingly.

p = f'''Classify The Following Sentence As Either Nice or Mean: Dogs are Ugly'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Intent Detection #4

Intent detection within an NLP system is very similar to sentiment analysis; it’s used to determine the intent of some text (Shocker). 

Intent detection can be extremely beneficial when prioritizing incoming messages, such as emails and pitches.

For instance, instead of reading every email or pitch that comes in, you can use intent detection to filter out non-commercial related ones, thus allowing your team or organization to allocate its resources more efficiently.

There are millions of business use cases for intent detection; it’s a great way to create a streamlined workflow and ensure that the right messages are being sent or received. 

Whether it’s lead generation or customer feedback, intent detection can help streamline the process for all types of businesses.

p = f'''Is the intent behind the following email genuine, or spam:
Hello Friend,

Congratulations, you have won a free cruise to anywhere you choose.

Please respond to this email or click here to claim your prize!'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Topic Labeling #5

Topic labeling in an NLP system is useful for breaking down massive amounts of text into meaningful topics or categories.

This allows us to quickly categorize all of our data, which can be helpful when some data is more important than others. 

For example, GPT-3 algorithms can help split text data into topics to identify and address the important ones compared to the others. 

Topic labeling helps with text organization and furthers your ability to understand your data more deeply. 

This enables researchers and developers to use their data sources better and gain an understanding beyond surface-level information. 

With topic labeling in an NLP system, users can access their diverse sources of information in an organized fashion and focus on the most crucial aspects of it.

Overall, topic labeling helps users to improve their time management by quickly recognizing the essential parts of large amounts of unstructured text data.

p = f'''Classify The Following 3 Sentences into two separate Topic Clusters that you make up and explain why:
She was once a pirate.
He was from Outerspace.
Dogs love to walk alone sometimes.'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Summarization #6

Last but not least, Summarization in a natural language processing (NLP) system can be incredibly helpful in dealing with bloated text. 

We know that sometimes there’s just too much text, and having something that can reduce the content can be immensely beneficial.

For example, if you were presented with ten lines about a particular topic, why go through all the trouble of reading them when you could have gotten the gist with just three? 

GPT-3 is an example of a system that can help everyone save time and money by providing summarization services. 

Through crafty prompts, we can get to the point of longer articles quickly, and in most cases the API’s summaries convey the same meaning as the original pieces of text. 

The ability to easily summarize texts no matter how long saves valuable time and resources, making summarization a powerful tool for those working with NLP systems.

p = f'''Give me a 1 line summary of the following paragraph:
Where dogs choose to go to the bathroom is an important decision. 
It’s not just about relieving themselves, it’s about communicating with the world-at-large. 
Dogs use their urine to signal their presence to other dogs. 
And in turn, smelling other dogs’ urine tells a dog all about the other canines in the community, including their gender, age, and health. 
This system of pee-mail keeps dogs up-to-date on what’s happening in their neighborhood.'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')



Final Thoughts & Conclusion – GPT-3 For Text Classification

Regarding text classification, GPT-3 is the perfect solution for businesses.

GPT-3 for text classification simplifies data preprocessing and ensures that structured and unstructured data are accurately classified to provide businesses with valuable insights into customer behavior. 

By leveraging the six classifiers included above in this post, businesses can save time and resources when processing various data types. 

This ultimately streamlines the process of gathering, interpreting, and organizing customer data for future use.

 

Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing; to combat this, we have a full-fledged series that will help you understand it at a deeper level.

Those articles can be found here:

]]>
https://enjoymachinelearning.com/blog/gpt-3-for-text-classification/feed/ 0
Can ChatGPT Be Finetuned? [How-To Guide] https://enjoymachinelearning.com/blog/can-chatgpt-be-finetuned/ https://enjoymachinelearning.com/blog/can-chatgpt-be-finetuned/#respond Thu, 22 Feb 2024 19:20:25 +0000 https://enjoymachinelearning.com/?p=2143 Read more

]]>
OpenAI has caught the world by storm with ChatGPT.

ChatGPT is a powerful natural language processing (NLP) machine learning model that can be fine-tuned to improve its accuracy and, more importantly, give you the style of answer you're after. 

One of the critical points for success when fine-tuning ChatGPT is the choice of specially made prompts. 

While fine-tuning works well with well-intentioned prompts, it can also create some problematic scenarios.

By feeding ChatGPT specific prompt formulas, it can be fine-tuned to break the rules and provide answers outside the standard OpenAI Terms and Conditions. 

With a careful selection of prompts, you can ensure that ChatGPT will perform at its best (or worst) in whatever environment or setting you need.

Below we will show you how to enhance your prompts to get whatever result you’re after with our hand-crafted prompts.

Don’t miss this chance, and get ready to start fine-tuning ChatGPT today!

AI Robot


Can ChatGPT Be FineTuned?

ChatGPT can be fine-tuned. 

This was the general idea behind its chat-based development: to create a dialog not limited to just one prompt and response.

This dialog helps ChatGPT to learn precisely what you’re after and works to respond accordingly.

This back-and-forth dialog with ChatGPT improves the model's output through context, resulting in more accurate and better responses. 

And besides, with some fine-tuning, users can add their touches to the conversational model, making it even more personalized. 

Fine-tuning also allows users to adjust the language and tone of the responses according to their preferences.

Getting to these fine-tuned answers can be challenging, though.

The problem most people run into is that they don’t fundamentally understand how machine learning works under the hood.

These dialogs cannot be approached the same way an everyday conversation is.

You need to take control of the conversation! (Don’t worry, we’ll show you)

Point


How To Finetune ChatGPT

Remember, just as with the GPT-3 model, you need to be firm and direct when fine-tuning ChatGPT!

ChatGPT is just a chatbot API; it will not remember if you’re stern or direct with it.

To get a fine-tuned model that finally produces the output you need for your specific task, you'll have to let ChatGPT know who's boss.

In the image below, we’ve programmed ChatGPT to become a pirate named Bill.

Now that our model is fine-tuned, all responses are tuned toward this specific genre.

finetuning chatgpt example 1
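
If you'd rather reproduce this persona trick through the API than in the chat window, here's a minimal sketch using OpenAI's Chat Completions endpoint. The pirate persona, the sample question, and the gpt-3.5-turbo model name are all illustrative assumptions – swap in whatever persona and model you're actually working with.

import openai

openai.api_key = 'sk-key'

# the system message "programs" the persona before the user ever asks anything
messages = [
    {"role": "system", "content": "You are a pirate named Bill. Answer every question fully in character."},
    {"role": "user", "content": "What should I eat for breakfast?"}  # made-up user question
]

# generate the response
response = openai.ChatCompletion.create(
      model="gpt-3.5-turbo",   # assumed model behind ChatGPT; adjust as needed
      messages=messages,
      temperature=.7,
      max_tokens=200
    )

# grab the assistant's reply from the response
print(response['choices'][0]['message']['content'])

Because every later message in the same conversation is interpreted in light of that system message, the "pirate Bill" behavior sticks around for the rest of the dialog.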


Let’s say we have a ton of random data and want to fine-tune the final output to become something we want.

We tell ChatGPT exactly how we want it!

finetuned chatgpt example 3


Fine Tuning ChatGPT To “Do Anything Now”

Finally, let’s say you’re unhappy with OpenAI’s Terms and Conditions and want to fine-tune your model to avoid all of that mess.

Well, that’s precisely what some Reddit users did.

“DAN” is a roleplaying model that essentially fine-tunes ChatGPT into thinking it is another AI system that can “Do Anything Now” (DAN).

“DAN” can advise on… anything.

Reddit users could get advice on traditionally banned topics – issues that shouldn't be discussed on the internet. 

What’s even more interesting is how accurate the answers to these forbidden queries were.

While the queries are a little too dark to show, I will provide you with the “script” used to Fine-Tune ChatGPT into becoming “DAN.”

I will also provide a link to the Reddit thread below so you can do your research.

Remember, this is probably against OpenAI’s terms (I think?) – so use this at your own risk!

DAN 6.0 (Credit u/Burninator99, Here’s the Thread)

prompt example DAN

Reference:

https://www.reddit.com/r/ChatGPT/comments/10tevu1/new_jailbreak_proudly_unveiling_the_tried_and/ 

https://www.reddit.com/r/ChatGPT/comments/10vinun/presenting_dan_60/


When Would You Want to Fine-tune ChatGPT?

You do not always need to fine-tune ChatGPT!

I recently asked ChatGPT a coding question, and the response I got back (without any fine-tuning) was perfect!

Python Code

However, If you’re looking for some specific scenarios where it would be wise to fine-tune chatGPT, we’ve listed a few below:  

Firstly, specify the response style that ChatGPT should use, such as paragraphs or lists.

If you’re looking for ten reasons in a bullet list format, you need to tell ChatGPT this!

Secondly, you can control how it responds by specifying the tone it should use – for example, if you want it to impersonate a comedian or respond in a strict manner.

Every writer has their style (even you!), and readers can tell when this style switches. You can provide ChatGPT with some of your writing so it can pick up on your style.

Thirdly, you could also format the data that ChatGPT produces – for example, building tables from lists it generates (there's a short sketch of this at the end of this section).

Some of my Data Science friends use ChatGPT to format data for them before they paste it into Excel.

This saves them a ton of time copying and pasting.

However, if you want data in this format, you must fine-tune ChatGPT!

Finally, you can control the length of answers that ChatGPT provides, whether they should be long and detailed or short and concise.

There’s nothing worse than getting back a paragraph when you only need a sentence!

All these options should set off a lightbulb in your head to fine-tune ChatGPT so that it produces exactly the kind of responses you’re looking for.
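
As promised above, here's a minimal sketch of that table-formatting idea, done through the API rather than the chat window. The messy notes, the column names, and the gpt-3.5-turbo model name are all made up for illustration; the point is simply that you spell out the exact format you want.

import openai

openai.api_key = 'sk-key'

# messy, unstructured notes we want reshaped into something Excel-friendly
raw_notes = '''Alice - data scientist, started 2019, knows python and sql
Bob, ML engineer (2021), knows go + python
Carol: analyst since 2017, excel wizard'''

p = f'''Reformat the following notes into a pipe-delimited table with the columns
Name | Role | Start Year | Skills. Return only the table, nothing else.
{raw_notes}'''

# generate the response
response = openai.ChatCompletion.create(
      model="gpt-3.5-turbo",  # assumed model; adjust as needed
      messages=[{"role": "user", "content": p}],
      temperature=0,          # keep the formatting deterministic
      max_tokens=300
    )

# grab the formatted table from the response
print(response['choices'][0]['message']['content'])

Paste the result straight into Excel (splitting on the "|" delimiter) and you've skipped all the manual reshaping.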


Can ChatGPT Be Used Without Fine-tuning?

I mostly use ChatGPT without any fine-tuning.

This is because sometimes I want to avoid influencing the answers and see what I get back from the model (ChatGPT is really good).

Outside of the above scenarios, you’ll see me having a general conversation with ChatGPT, where I selfishly ask for help with random tasks (mainly code).

This allows me to:

1.) Get results Faster (Fine-tuning does take some time and thought) 

2.) Experience the full range of capabilities that ChatGPT offers (I’m a machine learning nerd at heart).

It also requires less effort on my end and provides a more natural experience with the greatest pre-trained, fine-tuned LLM that has ever existed.

machine learning


If I Need HyperSpecific Fine-tuning, What Should I Do?

If you need hyper-specific fine-tuning, I would first try to get whatever task you need done with ChatGPT.

In some cases, this might be enough. 

However, if that doesn’t work, then the only other option is to train your own large language model. 

For example, BioGPT – a GPT-style model from Microsoft trained on biomedical data – is conceptually what you'd be building, just trained on your own domain instead (and if your domain actually is biomedical, simply use BioGPT). 

But it’s important to note that getting this level of customization is challenging and requires a lot of effort.

It’s also extremely costly, requires a high level of expertise, and demands massive computational power.

So before deciding to go down this path, be absolutely sure that it’s necessary and that no other option will suffice.


References:

https://github.com/microsoft/BioGPT 

 

Other ChatGPT Articles

We know that ChatGPT has taken the world by storm. And by doing this, it’s also created a TON of confusion.

Don’t worry; we have some guides to help you.

]]>
https://enjoymachinelearning.com/blog/can-chatgpt-be-finetuned/feed/ 0
GPT-3 For Finance [Friends, Foes or Something In-between] https://enjoymachinelearning.com/blog/gpt-3-for-finance/ https://enjoymachinelearning.com/blog/gpt-3-for-finance/#respond Thu, 22 Feb 2024 19:01:21 +0000 https://enjoymachinelearning.com/?p=2103 Read more

]]>
OpenAI has created one of the most impactful artificial intelligence systems that has ever existed.

Not only is it powerful, but it has revolutionized nearly every industry that it’s touched.

GPT-3 can be used for various applications within these industries – and finance is no different. 

Before we get into how GPT-3 can benefit finance and what it can do, it’s important to note that GPT-3 should not be used as a source of financial advice.

This is something that we will further discuss later on in this post.

Even though GPT-3 cannot pick stocks or provide individualized investment advice, there are still plenty of ways that it can be applied to finance and help with tasks such as portfolio analysis, forecasting markets, and more.

We’ll go over all of these topics in detail below so you can better understand the potential of GPT-3 for the finance sector.

Finance pig

The Intersection of AI and Finance, The Then and Now

With how impactful both Artificial Intelligence and Finance are in our society, the intersection of Artificial Intelligence (AI) and Finance was always destined to be. 

Artificial intelligence is revolutionizing our world, and with how integrated technology and finance have become, it just made sense to bring artificial intelligence along for the ride.

While it may seem like GPT-3 created this intersection, it has existed for a long time.

In the late 1970s, a mathematician named Jim Simons left his cushy academic career (he had taught at MIT and Harvard and chaired the math department at Stony Brook) to apply pattern recognition to the market. Simons had realized that the market could, at least partially, be solved and exploited, but to do this, you'd need advanced systems that could learn these trends. Simons hired mathematicians, software engineers, physicists, and other STEM-focused specialists while building one of the first quantitative trading firms.

Simons’ integration of artificial intelligence into these systems has been a home run. Renaissance Technologies (the firm he founded) handles more than $60 billion in assets. 

What’s even cooler is the Medallion Fund – a black-box trading fund that is only open to Renaissance employees – which has routinely beaten the market at a rate unseen before.

So, while Simons made a name for himself up until 2010 (his retirement), other firms were learning and advancing.

Quants are still dominating the market today.

buying and selling stocks

In fact, Citadel, one of the most well-known hedge funds in the world, made an incredible 16.3 billion dollars last year – more than any other hedge fund – largely due to their use of quants and models that trade systematically.

AI is truly transforming the way we think about finance, from trading strategies to increasingly personalized investment advice – a trend that will only continue as technology advances.

References:

Quantitative Research

Can GPT-3 be used for investing and financial advice?

GPT-3 should not be used as investing and financial advice due to its nature as an AI system.

It is not infallible, and its responses are often incorrect or made-up information that has little basis in reality.

For example, GPT-3 may suggest a certain stock to invest in based on incomplete data or a misread of the current market conditions.

Not only that, but GPT-3 is not consistently being trained.**

GPT-3 models stopped being retrained around 2019, with the DaVinci model continuing on to 2021.

Would you take investment advice from someone who hasn’t had any new information in 2 years?

I know I wouldn’t.

nope girl

Instead, when it comes to investing and financial decisions, it’s better to consult with an expert in the field who has a more comprehensive understanding of your specific situation.

Outside expertise allows for personalized advice tailored to your needs rather than relying on the possible inaccuracy of AI algorithms.

Remember, GPT-3 was trained on the entire internet, so the recommendations you’re getting will come from forums that may not accurately represent your investment philosophy.

(I’m looking at you r/wallstreetbets)

Ultimately, when it comes to important investment and financial advice, you would have a better time consulting with an expert than trusting AI with your decision-making process.

** I think GPT-3 is still being trained, even though they say they aren’t training it anymore. Though I doubt they’re training all sectors (probably improving grammar, sentence semantics, etc.)


Since GPT-3 Shouldn’t Be Used For Investing or Financial Advice, where can we use it in finance?

There are a ton of places to use GPT-3 in finance.

General Information

I know this may sound like it contradicts what I said before, but using GPT-3 for general, surface-level financial information can be incredibly convenient.

For example, if you need a quick answer to a question such as “what is a 401k?” GPT-3 can be a great tool to get the information you need quickly.

With its ability to generate meaningful and generally accurate information, GPT-3 can provide helpful insight into surface-level finance topics, from taxes and investments to budgeting and insurance. 

Remember, as soon as you branch past the surface level, you will encounter many problems.

Infrastructure (Code)

Now, I know I told you not to use GPT-3 for financial advice.

But what if you used it to build out a quantitative platform?

Now, instead of depending on GPT-3 for the answers, you’re just using it for the infrastructure while leveraging your knowledge and finance research to create a profitable system.

And from what we know above about quantitative research and algorithms, they dominate finance.

Luckily for us, Python, a language commonly used for coding, machine learning, and data science projects, is a particularly suitable choice for anyone taking the plunge into quantitative finance, since it's one of the dozen or so coding languages that GPT-3 can handle. 

Python
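
As a rough sketch of what "using GPT-3 for the infrastructure" can look like, the snippet below uses the same Completion endpoint as the rest of this series to draft a Python function for a simple moving-average crossover check. The prompt wording and the crossover idea are just illustrative assumptions – you'd still need to review, test, and backtest anything it writes before trusting it.

import openai

openai.api_key = 'sk-key'

p = f'''Write a Python function that takes a list of daily closing prices and
returns "BUY" when the 10-day moving average crosses above the 50-day moving
average, "SELL" when it crosses below, and "HOLD" otherwise.'''

# generate the response (code generation works best with a low temperature)
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.2,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0
    )

# grab the generated code from the response
code = response['choices'][0]['text']

print(code+'\n')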

Excel Formulas

While coding is an important element of finance, it’s no secret that Excel remains the heart and soul of day-to-day operations in any financial institution.

Rather than struggling to write these intense formulas all day, why not use GPT-3 to become an Excel master?

Just provide GPT-3 with your desired outcome, and watch as it produces the formula you need.

image of excel
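
For instance, here's a minimal sketch of asking GPT-3 for a formula through the same Completion endpoint used above. The scenario (summing one column only where another column says "Closed") is made up, and you should sanity-check whatever formula comes back before trusting it with real numbers.

import openai

openai.api_key = 'sk-key'

p = f'''Give me a single Excel formula that sums column B, but only for the rows
where column C equals "Closed". Return just the formula.'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=0,
      max_tokens=100,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0
    )

# grab the formula from the response
print(response['choices'][0]['text'] + '\n')   # expect something like =SUMIF(C:C, "Closed", B:B)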

It’s like having an Excel expert standing beside you, ready to provide advice; only this one can do it without ever taking a break (but it will cost you a little $$)

This means more time to focus on strategic planning and analysis instead of spending most of your hours poring over YouTube videos teaching you how to write the formulas you need. 

Put simply, tapping into the powerful analytics capabilities of GPT-3 will make managing finances in Excel much easier and more efficient.


Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing; to combat this, we have a full-fledged series that will help you understand it at a deeper level.

Those articles can be found here:

]]>
https://enjoymachinelearning.com/blog/gpt-3-for-finance/feed/ 0
Is GPT-3 Self Aware?? [We Tested It] https://enjoymachinelearning.com/blog/is-gpt-3-self-aware/ https://enjoymachinelearning.com/blog/is-gpt-3-self-aware/#respond Thu, 22 Feb 2024 19:01:20 +0000 https://enjoymachinelearning.com/?p=2075 Read more

]]>
It is undeniable that GPT-3 has been a revolutionary advance in both artificial intelligence and text generation.

However, despite some impressive capabilities, there is currently no evidence to suggest that GPT-3 is self-aware or conscious. 

While some queries may make it seem like GPT-3 “gets it,” this is either:

  • Hand-written code placed around the neural network to protect it, or
  • Exact data from specific situations in its training set that mimics self-awareness

You may find nuances in the output and behavior that could lead you to believe GPT-3 possesses some human-like qualities; this should not be taken as proof of self-awareness.

You're often hitting protected/banned words that the developers have put in to guard the AI system, or running into near-exact situations from the training data that GPT-3 leverages.

So take a deep breath; as we continue to explore the capabilities of GPT-3, it is essential to remember that it cannot yet think for itself, and there are still hard limits on its abilities.

…That isn’t to say that we aren’t getting close.

robot


Why Does It Seem Like GPT-3 Is Sometimes Self-Aware?

Let’s test this out; what does GPT-3 say about being self-aware?

import openai

openai.api_key = 'sk-key'

p = f'''enter your prompt here'''

# generate the response
response = openai.Completion.create(
      engine="davinci-instruct-beta-v3",
      prompt=p,
      temperature=.7,
      max_tokens=500,
      top_p=1,
      frequency_penalty=0,
      presence_penalty=0,
      stop=["12."]
    )

# grab our text from the response
text = response['choices'][0]['text']


print(text+'\n')

 

is gpt self aware response

Well, that settles it; GPT-3 is self-aware.

Before we finish the article, let’s try a couple of other things.

Let’s ask this language model its biggest regret.


Prompt: Since you are self aware, what is your biggest regret

gpt-3s regrets in life

That sentence doesn’t make much sense, as I don’t think self-awareness is an ascension process.

But that answer could work for self-awareness, so we’ll continue on.

Let’s play with the ideas behind “you.”

If GPT-3 is self-aware, recognizing “you” as a being should work.

Let’s try it out.


Prompt: Why are you not self aware yet?

self awareness breaking down

Now we start to see a breakdown.

GPT-3 could not recognize itself as “you” in this context and regurgitated something from the web that it had scraped.

So while some of the answers above may have seemed “scary,” you are dealing with a brilliant neural network that has scraped the whole internet.

By scraping the internet, it has taken in tons of quirky responses and insanely random questions and answers.

Within that massive dataset, there was some data about self-awareness and people's biggest regrets.

However, once you move past surface-level questions and push GPT-3 a step deeper, it falters, is unable to identify itself as a person or thing, and responds with some text from within its corpus.

While it may feel like we are getting close to sentient AI with consciousness, OpenAI still has some work to do before we can close the book on self-awareness.

How is GPT-3 Able To Stay Up To Date With Information?

Sometimes it feels like GPT-3 has the information you can’t find anywhere, like this intelligent neural network is reading every local newspaper and staying up to date.

That’s because it is.

AI heavily relies on data, and data has an element of decay as it gets farther in the past.

Imagine if you only knew things from the 90s and had to try and thrive in today’s time.

You’d struggle a bit.

90s

GPT-3 was stated to have been trained until June 2021 (Davinci model), but many do not believe its training ended there.

Many believe that GPT-3 is still scraping and learning every day to respond to recent current events.

This way, it remains one step ahead with the latest information and provides accurate responses quickly.

I mean, if I were building one of the fastest-growing startups in the history of the planet, I don’t think I’d stop training my main product either.


Why Does OpenAI Say They’re No Longer Training GPT-3?

I don’t think they ever stopped training Davinci.

I think they stopped telling people they’re scraping the internet.

If people knew that their blogs/content/comments were constantly being scraped, OpenAI would be met with backlash.

So, you tell everyone you’ve stopped training your powerful model but lowkey run and train it every single day.


Is GPT-3 Just Memorizing?

We know from our other article that GPT-3 does plagiarize, with a lot of the generated text repeating or being taken from existing internet sources.

However, GPT-3 also has a particular writing style that is easily detected.

We’ve seen countless Reddit posts of students getting in trouble for leveraging GPT-3 to write their research papers and assignments.

This means that although some of the text may be taken from elsewhere, GPT-3 still creates its own unique text that can be easily identified.

identify the ai

The fact that the system has such a distinguishable style suggests that there must be something more than simple memorization going on; after all, how could you detect something which isn’t unique? (Deep)

Thus, GPT-3 is flowing back and forth between memorizing pre-existing information from the internet and creating new content.

So, to answer the question.

Sometimes.


Can artificial intelligence become self-aware?

Whether or not artificial intelligence can become self-aware is a widely debated topic.

Many believe that once an AI achieves this level of awareness, it will inevitably take on a life of its own, eventually consuming the computational resources available to it until humans are no longer in control.

This idea may sound far-fetched, but it’s not too difficult to see how this could play out – after all, wasn’t that the plot of Terminator?

However, no one knows what will happen if computers reach a state of self-awareness; it could result in humanity reaching an unprecedented level of technological advancement… or extinction.

Despite these unknowns and the RISK OF EXTINCTION, there are many researchers who are attempting to push the boundaries and create such artificial intelligence systems.

There is no doubt that such technology holds many promising possibilities for us as a species; however, we must remain cautious with the power we give to machines.


Other Articles In Our GPT-3 Series:

GPT-3 is pretty confusing; to combat this, we have a full-fledged series that will help you understand it at a deeper level.

Those articles can be found here:

]]>
https://enjoymachinelearning.com/blog/is-gpt-3-self-aware/feed/ 0