Backend Tennis Step 7 #1901

Open
25 tasks
jonsnowpt opened this issue Jan 11, 2024 · 0 comments

8. Use the response created by OpenAI and send it to Hasura DB

TAGS APPROACH:

Implement Tag Management with Similarity Check in Content Addition

Objective: Enhance the content addition process to manage tags efficiently in the Hasura database using C#. Implement a similarity check for tags to determine if an existing tag should be used.

Requirements:

Database Schema:

Table: tags
Fields: ID, NAME, DESCRIPTION

Content Structure:

  • Content includes a tags section, e.g., {"tags": ["ATP", "Jannik Sinner", "Novak Djokovic"]}

Tag Management Logic:

When content is added, the C# code must:

  • Parse the tags section.

For each tag:

  • Check whether a similar tag exists in the tags table (compared by NAME) with ≥70% similarity.
  • If a similar tag exists, use the existing ID.
  • If no similar tag is found, insert a new tag.

Similarity Check:

Implement a function to calculate the similarity percentage between two strings.
Use an appropriate algorithm like Jaccard similarity or Levenshtein distance.
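
For comparison with the Levenshtein-based function under Code Examples, here is a minimal sketch of the Jaccard option. It assumes tags are compared case-insensitively on character bigrams; the class name `TagSimilarity`, the bigram tokenization, and the handling of empty strings are illustrative choices, not part of the original requirements.

```csharp
using System;
using System.Collections.Generic;

public static class TagSimilarity
{
    // Splits a tag into lower-cased character bigrams, e.g. "atp" -> { "at", "tp" }.
    // Bigrams are an assumption; word-level tokens would also work.
    private static HashSet<string> Bigrams(string value)
    {
        var text = value.Trim().ToLowerInvariant();
        var set = new HashSet<string>();
        if (text.Length == 1) set.Add(text); // a single character is its own token
        for (int i = 0; i + 1 < text.Length; i++)
            set.Add(text.Substring(i, 2));
        return set;
    }

    // Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
    public static double Jaccard(string source, string target)
    {
        var a = Bigrams(source);
        var b = Bigrams(target);
        if (a.Count == 0 && b.Count == 0) return 1.0; // two empty tags are treated as identical

        var intersection = new HashSet<string>(a);
        intersection.IntersectWith(b);
        int unionCount = a.Count + b.Count - intersection.Count;
        return (double)intersection.Count / unionCount;
    }
}
```

Jaccard ignores character order beyond the bigram window, so it tends to be more forgiving of reordered words than Levenshtein; the 70% threshold may need separate tuning for each algorithm.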

Implementation Steps:

Parse Tags from Content:

  • Extract the tags array from the content JSON.
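
A minimal sketch of the extraction step, assuming the content arrives as a JSON string with a top-level `tags` array as in the example above. It uses `System.Text.Json`; the class and method names are hypothetical, and trimming and de-duplicating the tags is an added assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ContentTagParser
{
    // Returns the distinct, trimmed tag names found under the top-level "tags" array.
    // Returns an empty list when the property is missing or is not an array.
    public static List<string> ExtractTags(string contentJson)
    {
        var tags = new List<string>();
        using var document = JsonDocument.Parse(contentJson);
        var root = document.RootElement;

        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("tags", out var tagsElement) ||
            tagsElement.ValueKind != JsonValueKind.Array)
        {
            return tags;
        }

        // De-duplicate case-insensitively so "ATP" and "atp" yield one entry.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var item in tagsElement.EnumerateArray())
        {
            if (item.ValueKind != JsonValueKind.String) continue;
            var name = (item.GetString() ?? string.Empty).Trim();
            if (name.Length > 0 && seen.Add(name)) tags.Add(name);
        }
        return tags;
    }
}
```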

Database Interaction:

  • Use the Hasura GraphQL API.

Tag Similarity Check and Insertion:

  • For each tag, query the database to check for similar tags.
  • Implement a similarity comparison function.
  • If a similar tag is found (≥70% match), use its ID.
  • If no similar tag exists, insert the new tag into the database.

Associate Tags with Content:

  • Link the content with the appropriate tag IDs (either existing or newly created).

Code Examples:

Similarity Check Function (e.g., using Levenshtein distance):

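int ComputeLevenshteinDistance(string s, string t)
{
    // Two-row dynamic programming over the edit-distance table.
    int[] prev = new int[t.Length + 1], curr = new int[t.Length + 1];
    for (int j = 0; j <= t.Length; j++) prev[j] = j;
    for (int i = 1; i <= s.Length; i++)
    {
        curr[0] = i;
        for (int j = 1; j <= t.Length; j++)
        {
            int cost = s[i - 1] == t[j - 1] ? 0 : 1; // substitution cost
            curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
        }
        (prev, curr) = (curr, prev); // the row just filled becomes the previous row
    }
    return prev[t.Length];
}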

double CalculateSimilarity(string source, string target)
{
    int maxLen = Math.Max(source.Length, target.Length);
    if (maxLen == 0) return 1.0;

    int dist = ComputeLevenshteinDistance(source, target);
    return (1.0 - (double)dist / maxLen);
}

Tag Processing in Content Addition:

void ProcessTags(string[] contentTags)
{
    foreach (var tag in contentTags)
    {
        var similarTag = FindSimilarTag(tag); // Implement this function to search in the database
        if (similarTag != null && CalculateSimilarity(tag, similarTag.Name) >= 0.7)
        {
            // Use the existing tag ID
        }
        else
        {
            // Insert new tag into the database
        }
    }
}
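
`FindSimilarTag` is left open above. One possible approach, assuming the tags table is small enough to load its rows into memory, is to score every candidate and keep the best one at or above the threshold. The `Tag` class, `TagMatcher`, and the `similarity` delegate parameter are hypothetical names used only for this sketch; any function returning a score in [0, 1], such as `CalculateSimilarity`, could be passed in.

```csharp
using System;
using System.Collections.Generic;

public sealed class Tag
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class TagMatcher
{
    // Returns the candidate with the highest similarity to tagName,
    // or null when no candidate reaches the threshold (0.7 = 70%).
    public static Tag FindBestMatch(
        string tagName,
        IEnumerable<Tag> candidates,
        Func<string, string, double> similarity,
        double threshold = 0.7)
    {
        Tag best = null;
        double bestScore = -1.0;
        foreach (var candidate in candidates)
        {
            double score = similarity(tagName, candidate.Name);
            if (score >= threshold && score > bestScore)
            {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```

Picking the highest-scoring candidate, instead of the first one over the threshold, avoids attaching content to a weaker match when a closer tag exists.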

SEND DATA TO HASURA:

To send the OpenAI-generated content for each language to the Hasura database, associated with the correct author and language, follow these steps:

1. Structure the Data:

  • First, structure the data received from OpenAI to match the fields of the authors_articles table in the Hasura database. This includes fields like seo_details, lang, permalink, data, status, published_date, tags, and author_id.

2. Map Languages to Author IDs:

  • Create a mapping in your C# application that associates each language with the corresponding author ID:
var languageAuthorMap = new Dictionary<string, int>
{
    { "en", 1 },  // English
    { "it", 2 },  // Italian
    { "es", 3 },  // Spanish
    { "pt-PT", 4 },  // Portuguese from Portugal
    { "pt-BR", 5 },  // Portuguese from Brazil
    { "sv", 6 },  // Swedish
    { "ro", 7 },  // Romanian
    { "fr", 8 }   // French
};

3. Format the Data for Each Language:

  • For each forecast generated in a particular language, format the data to create a new entry in the authors_articles table. Make sure to use the correct author_id based on the language.
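
A small sketch of the author lookup for this step, reusing the mapping from step 2. Throwing on an unmapped language and matching language codes case-insensitively are assumptions; `AuthorResolver` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class AuthorResolver
{
    // Same language-to-author mapping as in step 2.
    private static readonly Dictionary<string, int> LanguageAuthorMap =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "en", 1 }, { "it", 2 }, { "es", 3 }, { "pt-PT", 4 },
            { "pt-BR", 5 }, { "sv", 6 }, { "ro", 7 }, { "fr", 8 }
        };

    // Returns the author ID for a language code, or throws if the language is not mapped,
    // so that a forecast is never stored under the wrong author.
    public static int Resolve(string lang)
    {
        if (LanguageAuthorMap.TryGetValue(lang, out var authorId)) return authorId;
        throw new ArgumentException($"No author is mapped to language '{lang}'.", nameof(lang));
    }
}
```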

4. Construct GraphQL Mutation Queries:

  • Create a GraphQL mutation query for inserting data into Hasura. Here's an example of how you might structure this query:
var mutation = new
{
    query = @"
        mutation InsertArticle($id: Int!, $seo_details: jsonb, $lang: String!, $permalink: String, $data: jsonb, $status: String!, $published_date: timestamptz, $tags: jsonb, $author_id: Int!) {
            insert_authors_articles(objects: {id: $id, seo_details: $seo_details, lang: $lang, permalink: $permalink, data: $data, status: $status, published_date: $published_date, tags: $tags, author_id: $author_id}) {
                affected_rows
            }
        }",
    variables = new
    {
        id = /* ID */,
        seo_details = /* SEO Details */,
        lang = /* Language Code */,
        permalink = /* Permalink */,
        data = /* Article Data */,
        status = /* Status */,
        published_date = /* Published Date */,
        tags = /* Tags */,
        author_id = /* Author ID */
    }
};

IMPORTANT: The permalink should have the author ID at the end.
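
One way to build such a permalink, sketched under the assumption that the slug is derived from the article title by lower-casing it and replacing every run of non-alphanumeric characters with a hyphen. The slug rules and the name `PermalinkBuilder` are illustrative; the issue only specifies that the author ID comes last.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PermalinkBuilder
{
    // Lower-cases the title, collapses every run of characters other than
    // a-z and 0-9 into a single hyphen, then appends "-<authorId>".
    // Note: accented letters are dropped by this pattern; transliterate first if needed.
    public static string Build(string title, int authorId)
    {
        var slug = Regex.Replace(title.ToLowerInvariant(), "[^a-z0-9]+", "-").Trim('-');
        return slug.Length == 0 ? authorId.ToString() : $"{slug}-{authorId}";
    }
}
```

For example, `PermalinkBuilder.Build("Jannik Sinner vs Novak Djokovic", 1)` yields `jannik-sinner-vs-novak-djokovic-1`.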

5. Send the Data to Hasura:

  • For each forecast, send a GraphQL mutation request to Hasura using the structured data. Ensure that you replace placeholders in the mutation query with actual values from the forecast data.
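
A minimal sketch of the request, assuming the standard Hasura GraphQL endpoint (`/v1/graphql`) and authentication with the `x-hasura-admin-secret` header. The issue does not say how the backend authenticates, so the header choice, the `HasuraClient` name, and the parameters are assumptions; a JWT or a role-scoped token may be more appropriate in production.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class HasuraClient
{
    // Builds the JSON body of a GraphQL request: { "query": ..., "variables": ... }.
    public static string BuildRequestBody(string query, object variables) =>
        JsonSerializer.Serialize(new { query, variables });

    // Posts the request to the Hasura GraphQL endpoint and returns the raw response body.
    // endpoint and adminSecret are placeholders supplied by the caller.
    public static async Task<string> SendAsync(
        HttpClient http, string endpoint, string adminSecret, string query, object variables)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, endpoint);
        request.Headers.Add("x-hasura-admin-secret", adminSecret);
        request.Content = new StringContent(
            BuildRequestBody(query, variables), Encoding.UTF8, "application/json");

        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode(); // throws on transport-level failures (non-2xx)
        return await response.Content.ReadAsStringAsync();
    }
}
```

Passing values through `variables` instead of interpolating them into the query string keeps the mutation text constant and avoids escaping problems with JSON fields such as `data` and `seo_details`.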

6. Handle Responses and Errors:

  • After sending the request to Hasura, handle the response to check if the data was inserted successfully. Implement error handling to manage any issues that might occur during the data insertion.
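
A sketch of the response check, assuming the mutation shown in step 4. GraphQL servers commonly report failures in a top-level `errors` array even when the HTTP status indicates success, so the body is inspected explicitly. `HasuraResponse` and `ReadAffectedRows` are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class HasuraResponse
{
    // Returns affected_rows from an insert_authors_articles response,
    // or throws when the response carries a non-empty GraphQL "errors" array.
    public static int ReadAffectedRows(string responseJson)
    {
        using var document = JsonDocument.Parse(responseJson);
        var root = document.RootElement;

        if (root.TryGetProperty("errors", out var errors) &&
            errors.ValueKind == JsonValueKind.Array && errors.GetArrayLength() > 0)
        {
            var first = errors[0];
            var message = first.TryGetProperty("message", out var m) ? m.GetString() : "unknown error";
            throw new InvalidOperationException($"Hasura rejected the mutation: {message}");
        }

        return root.GetProperty("data")
                   .GetProperty("insert_authors_articles")
                   .GetProperty("affected_rows")
                   .GetInt32();
    }
}
```

A caller inside the Hangfire job could treat a result other than 1 as a failure and let the job retry, though whether retrying is safe depends on how duplicate IDs are handled.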

7. Testing and Validation:

  • Test the entire process with a few sample forecasts to ensure that data is correctly being sent to Hasura and associated with the right authors and languages.

  • Validate that the inserted data can be queried and is displayed correctly in the Hasura dashboard.

HASURA DB EXAMPLES:

betarena_prod authors articles

betarena_prod authors authors

betarena_prod authors tags

jannik-sinner-vs-novak-djokovic-betting-tip-2023-2024–picks-and-predictions-for-the-atp-finals-final-match-on-november-18th-2023-1

ACCEPTANCE CRITERIA:

  • Forecasts sent to Hasura
  • Structure the data to correctly populate the authors_articles table
  • Map the forecasts to the correct authors
  • Permalink present with the correct author username and article ID
  • Add the process to the current Hangfire flow
@jonsnowpt jonsnowpt converted this from a draft issue Jan 11, 2024