Commit

update README.md
abrichr committed Apr 12, 2023
1 parent c097825 commit 3305fd1
Showing 1 changed file with 12 additions and 7 deletions.
19 changes: 12 additions & 7 deletions README.md
@@ -63,9 +63,6 @@ Here are some stubs and suggestions to help you get started with your implementa
2. Create a new file, `gui_process_automation.py`, and import the necessary libraries:

```python
-
-import cv2
-import numpy as np
from segment_anything import SamPredictor, sam_model_registry
from transformers import GPTJForCausalLM, GPT2Tokenizer
from paddleocr import PaddleOCR
@@ -93,24 +90,32 @@ ocr = PaddleOCR()

```python

-def generate_input_event(new_screenshot, input_events):
-    # TODO: Implement the function to generate a new InputEvent based on the previous InputEvents and the new Screenshot
+def generate_input_event(new_screenshot, recording):
+    # TODO: Implement the function to generate a new InputEvent based on the new Screenshot and the previous Recording
     pass
```


5. In the `generate_input_event` function, you may want to follow these steps:

-a. Use the Segment Anything library to segment the objects in the new screenshot.
+a. Use the Segment Anything library to segment the objects in the new and previous screenshots.

-b. Use the PaddleOCR library to extract text information from the new screenshot.
+b. Use the PaddleOCR library to extract text information from the new and previous screenshots.

c. Generate textual prompts based on the segmented objects and extracted text, and use the GPT-J model to predict the next InputEvent properties.

d. Create a new InputEvent object based on the predicted properties and return it.
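
Steps a through d above could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the `InputEvent` and `Recording` dataclasses, the helpers `describe` and `build_prompt`, and the three injected callables (`segment`, `extract_text`, `complete`) are all hypothetical names introduced for illustration. In a real implementation the callables would wrap Segment Anything, PaddleOCR, and GPT-J respectively, and the model's completion is assumed to be a JSON object.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class InputEvent:
    # Hypothetical stand-in; the project's real InputEvent may have other fields.
    name: str
    x: Optional[float] = None
    y: Optional[float] = None
    text: Optional[str] = None


@dataclass
class Recording:
    # Hypothetical stand-in: previous screenshots plus the events recorded with them.
    screenshots: List[Any] = field(default_factory=list)
    input_events: List[InputEvent] = field(default_factory=list)


def describe(screenshot: Any, segment: Callable, extract_text: Callable) -> dict:
    """Steps a and b: summarize one screenshot as segments plus OCR text.

    Both callables are assumed to return JSON-serializable values.
    """
    return {"segments": segment(screenshot), "text": extract_text(screenshot)}


def build_prompt(previous: List[dict], current: dict, events: List[InputEvent]) -> str:
    """Step c, first half: turn the descriptions into a textual prompt."""
    lines = ["Previous screenshots:"]
    lines += [json.dumps(d) for d in previous]
    lines.append("Previous input events:")
    lines += [json.dumps(asdict(e)) for e in events]
    lines.append("New screenshot:")
    lines.append(json.dumps(current))
    lines.append("Next input event as JSON:")
    return "\n".join(lines)


def generate_input_event(new_screenshot, recording, *, segment, extract_text, complete):
    # Steps a and b, applied to the previous screenshots and the new one.
    previous = [describe(s, segment, extract_text) for s in recording.screenshots]
    current = describe(new_screenshot, segment, extract_text)
    # Step c: prompt the language model; its reply is assumed to be JSON.
    prompt = build_prompt(previous, current, recording.input_events)
    properties = json.loads(complete(prompt))
    # Step d: build the new InputEvent from the predicted properties.
    return InputEvent(**properties)
```

Passing the three models in as callables is a design choice of this sketch: it keeps the function importable without loading any model weights and lets tests substitute lightweight fakes.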

6. Write unit tests for your implementation in a separate file, `test_gui_process_automation.py`.
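
One possible shape for `test_gui_process_automation.py` is sketched below, assuming the model call can be replaced by a `unittest.mock.Mock`. The `generate_input_event` defined here is only a placeholder so that the example runs on its own, and its `predict` parameter is hypothetical; a real test file would import the function from `gui_process_automation` instead.

```python
import unittest
from unittest import mock


def generate_input_event(new_screenshot, recording, predict):
    # Placeholder for the function under test (hypothetical signature).
    # It simply delegates to the injected predictor.
    return predict(new_screenshot, recording)


class TestGenerateInputEvent(unittest.TestCase):
    def setUp(self):
        # A fake predictor stands in for the real models, so no weights are loaded.
        self.predict = mock.Mock(return_value={"name": "click", "x": 10, "y": 20})

    def test_returns_predicted_event(self):
        event = generate_input_event("new.png", ["old.png"], predict=self.predict)
        self.assertEqual(event["name"], "click")

    def test_calls_model_once_with_inputs(self):
        generate_input_event("new.png", ["old.png"], predict=self.predict)
        self.predict.assert_called_once_with("new.png", ["old.png"])
```

Tests written this way can be run with `python -m unittest test_gui_process_automation`.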


+### Bonus
+
+7. Use the HuggingFace transformers library to extract features from Screenshots and InputEvents and generate
+InputEvent replay sequences directly (end-to-end).
+
+
+
### Wrapping Up

Once you have implemented the `generate_input_event` function and written unit tests, commit your changes to your forked repository, create a pull request, and provide a brief summary of your approach, assumptions, and library integrations.