Forms
Learn how to create, manage, and analyze form definitions on the Weav.ai platform. This guide covers creating forms, updating definitions, filtering instances, and running analytics for efficient document processing.
#Overview
The Forms section enables users to define, manage, and analyze form structures on the Weav.ai platform. By setting up form definitions, users can extract structured data, run analytics, and streamline workflows with customizable fields and parameters. This guide provides detailed instructions for creating forms, filtering instances, and executing analytics on extracted data.
Prerequisite - To get started, ensure your environment is properly configured by following the Setup Guide.
#Create form
Creating a form definition. A form definition is required before running process_form
workflow.
python3 documents/forms/create_form.py --name "new form" --category "new" --description "test" --is_shared true --is_searchable true --fields "[{\n \"name\": \"MICROSOFT FORM\",\n \"description\": \"A form for microsoft\",\n \"category\": \"ANNUAL REPORT\",\n \"fields\": [\n {\n \"name\": \"Cost of revenue\",\n \"field_type\": \"Number\",\n \"is_array\": false,\n \"fill_by_search\": false,\n \"description\": \"Extract cost of revenue\"\n }\n ],\n \"is_searchable\": false,\n \"is_shared\": false\n}]"
Parameters:
Parameter | Description | Required/Optional | Allowed values |
---|---|---|---|
name | The form name | Required | |
category | A category name for the form | Required | |
description | The description of the form | Required | |
is_shared | A flag to decide sharing permissions | Optional (default : False) | false, f, False, true, t, True |
is_searchable | A flag to decide visibility | Optional (default : False) | false, f, False, true, t, True |
fields | Form fields | Optional | Stringified JSON |
Format for fields
[{
"name": "str",
"description": "str",
"category": "str",
"fields": [
{
"name": "str",
"field_type": "Number",
"is_array": boolean,
"fill_by_search": boolean,
"description": "str"
}
],
"is_searchable": boolean,
"is_shared": boolean
}]
## Example stringified version for CLI
"[{\n \"name\": \"MICROSOFT FORM\",\n \"description\": \"A form for microsoft\",\n \"category\": \"ANNUAL REPORT\",\n \"fields\": [\n {\n \"name\": \"Cost of revenue\",\n \"field_type\": \"Number\",\n \"is_array\": false,\n \"fill_by_search\": false,\n \"description\": \"Extract cost of revenue\"\n }\n ],\n \"is_searchable\": false,\n \"is_shared\": false\n}]"
Key | Description | Required/Optional | Allowed values |
---|---|---|---|
name | Name of the Entity | Required | |
field_type | The data type of the entity | Required | “Number”, “Date”, “Text”, “Table” |
description | A short instruction to the prompt about the field | Optional | |
is_array | If it’s an entity with multiple values | Optional (default: False) | True, False |
fill_by_search | Use internet search to fill this information | Optional (default: False | True, False |
Response:
{
"category":"new",
"created_at":datetime.datetime(2024, 9, 25, 20, 12, 11, tzinfo=datetime.timezone.utc),
"description":"True",
"fields":[
{
"description":"Net Sales for the quarter",
"field_type":"Number",
"fill_by_search":false,
"is_array":true,
"name":"Net Sales"
}
],
"id":"66f46e9b70dd6d497d9b8a37",
"is_searchable":true,
"is_shared":true,
"name":"new form",
"user_id":"google-oauth2|117349365869611297391"
}
#Delete Form definition
Deleting a form definition. The user is prompted to reconfirm the deletion.
python3 documents/forms/delete_form_definition.py --form_id 66ea66d547fff0950cba17e
Parameters:
Parameter | Description | Required/Optional |
---|---|---|
form_id | The unique identifier of the form | Required |
Response:
{
"category":"new",
"created_at":datetime.datetime(2024, 9, 25, 20, 12, 11, tzinfo=datetime.timezone.utc),
"description":"True",
"fields":[
{
"description":"Net Sales for the quarter",
"field_type":"Number",
"fill_by_search":false,
"is_array":true,
"name":"Net Sales"
}
],
"id":"66f46e9b70dd6d497d9b8a37",
"is_searchable":true,
"is_shared":true,
"name":"new form",
"user_id":"google-oauth2|117349365869611297391"
}
#Filter form
Search capabilities to retrieve form definitions
python3 documents/forms/filter_form.py --query "SECURITIES AND EXCHANGE COMMISSION" --scope "all_forms"
Parameters:
Parameter | Description | Required/Optional | Allowed values |
---|---|---|---|
scope | The scope of search | Required | all_forms, my_forms |
is_searchable | Filter for visibility | Optional (default : False) | false, f, False, true, t, True |
query | When applied, string matches category | Optional (default : “”) |
Response:
{
"forms":[
{
"category":"SECURITIES AND EXCHANGE COMMISSION",
"created_at":"2024-09-25T09:13:50Z",
"description":"",
"fields":[
{
"description":"",
"field_type":"Date",
"fill_by_search":false,
"is_array":false,
"name":"testEnitity1"
}
],
"id":"66f3d44eeb87303bc52bb9b4",
"is_searchable":false,
"is_shared":false,
"name":"test",
"user_id":"google-oauth2|117349365869611297391"
}
]
}
#Get form definitions
Fetching the form definition of a particular form.
python3 documents/forms/get_form_definition.py --form_id "66f46e9b70dd6d497d9b8a37
Parameters:
Parameter | Description | Required/Optional |
---|---|---|
form_id | Form ID | Required |
Response:
{
"category":"SECURITIES AND EXCHANGE COMMISSION",
"created_at":"2024-09-25T09:13:50Z",
"description":"",
"fields":[
{
"description":"",
"field_type":"Date",
"fill_by_search":false,
"is_array":false,
"name":"testEnitity1"
}
],
"id":"66f3d44eeb87303bc52bb9b4",
"is_searchable":false,
"is_shared":false,
"name":"test",
"user_id":"google-oauth2|117349365869611297391"
}
#Update form definition
Updating the definition of a form such as name, description, entities and their data etc.
python3 documents/forms/update_form_definition.py --form_id 66f46e9b70dd6d497d9b8a37 --name "update" --category "new" --description "Test desc" --is_shared false --is_searchable false
Parameters:
Parameter | Description | Required/Optional | Allowed values |
---|---|---|---|
form_id | Form identifier | Required | |
name | Form name | Required | |
category | Form category | Required | |
description | Form description | Optional (default : “”) | |
is_shared | Filter for sharing permissions | Optional (default : False) | false, f, False, true, t, True |
is_searchable | Filter for visibility | Optional (default : False) | false, f, False, true, t, True |
fields | Form fields | Optional | Stringified JSON |
Format for fields
[{
"name": "str",
"description": "str",
"category": "str",
"fields": [
{
"name": "str",
"field_type": "Number",
"is_array": boolean,
"fill_by_search": boolean,
"description": "str"
}
],
"is_searchable": boolean,
"is_shared": boolean
}]
## Example stringified version for CLI
"[{\n \"name\": \"MICROSOFT FORM\",\n \"description\": \"A form for microsoft\",\n \"category\": \"ANNUAL REPORT\",\n \"fields\": [\n {\n \"name\": \"Cost of revenue\",\n \"field_type\": \"Number\",\n \"is_array\": false,\n \"fill_by_search\": false,\n \"description\": \"Extract cost of revenue\"\n }\n ],\n \"is_searchable\": false,\n \"is_shared\": false\n}]"
Key | Description | Required/Optional | Allowed values |
---|---|---|---|
name | Name of the Entity | Required | |
field_type | The data type of the entity | Required | “Number”, “Date”, “Text”, “Table” |
description | A short instruction to the prompt about the field | Optional | |
is_array | If it’s an entity with multiple values | Optional (default: False) | True, False |
fill_by_search | Use internet search to fill this information | Optional (default: False | True, False |
Response:
{
"category":"SECURITIES AND EXCHANGE COMMISSION",
"created_at":"2024-09-25T09:13:50Z",
"description":"",
"fields":[
{
"description":"",
"field_type":"Date",
"fill_by_search":false,
"is_array":false,
"name":"testEnitity1"
}
],
"id":"66f3d44eeb87303bc52bb9b4",
"is_searchable":false,
"is_shared":false,
"name":"test",
"user_id":"google-oauth2|117349365869611297391"
}
#Filter form instances
A search query for retrieving form instances.
python3 documents/forms/filter_form_instances.py --scope all_documents --status "DONE" --category "SECURITIES AND EXCHANGE COMMISSION"
Parameters:
| Parameter | Description | Required/Optional | Allowed values | | — | — | — | — | | scope | The scope of search | Required | “all_documents”, “current_document”, “my_documents”, “shared_documents” | | status | Status of workflow | Optional | “NOT_STARTED”, “IN_PROGRESS”, “DONE”, “FAILED” | | category | Category of | Optional | | | query | When applied, string matches category | Optional | | | form_id | Form identifier | Optional | | | doc_id | Document identifier | Optional | | | only_latest | Fetches only latest | Optional (default True) | | | skip | Number of documents to skip | Optional (default 0) | | | limit | Max number of documents | Optional (default 25) | | | all | If set to true, all instances are fetched | Optional (default : False) | |
Response:
{
"total":4,
"form_instances":[
{
"form_instance":{
"data":[
{
"name":"testEnitity1",
"value":"2023-08-03T00:00:00",
"identifier":"1f929c08-655e-4cf2-845c-18e3eb428ce7",
"weav_page_number":24
}
],
"metadata":{
"modified_at":"2024-09-25T22:09:52.016000",
"status":"DONE"
}
},
"doc_id":"66e0fba3089fbd21c4dd80c3",
"form_id":"66f3d44eeb87303bc52bb9b4",
"file_name":"AAPL_10Q.pdf",
"status":"AI_READY",
"category":"SECURITIES AND EXCHANGE COMMISSION",
"in_folders":[
"66e0f93093798ee1c937e39a"
],
"owner_id":"google-oauth2|117349365869611297391"
},
.
.
.
.
]
}
#Execute form analytics
Running analytics on a form.
python3 documents/forms/execute_form_analytics.py --form_id 66f3d44eeb87303bc52bb9b4 --skip 0 --limit 25
Parameters:
Parameter | Description | Required/Optional |
---|---|---|
skip | Number of documents to skip | Optional (default 0) |
limit | Total number of documents to consider | Optional (default 25) |
form_id | Form Identifier | Required |
query | This is a mongo pipeline query that is fetched from the Search service. (default: “{\n ‘reason_for_no_pymongo_pipeline’: ‘No user request provided’\n}”) | Optional |
Response:
{
"columns":[
"Net Sales",
"Total sales"
],
"results":[
{
"Net Sales":[
81797.0,
82959.0,
.
.
.
.
],
"metadata":{
"_id":"66fe1b65b1d0dfb13c9975f0",
"file_name":"AAPL_10Q.pdf",
"in_folders":[
]
}
},
{
"Net Sales":[
81797.0,
82959.0,
.
.
.
.
],
"Total sales":94569.0,
"metadata":{
"_id":"66fe29e1eb87303bc52bba93",
"file_name":"AAPL_10Q.pdf",
"in_folders":[
]
}
}
],
"summary":"",
"total_count":2
}
#Download query result
python3 documents/forms/download_query_result.py --form_id 66f3d44eeb87303bc52bb9b4 --download_format "JSON"
Parameters:
Parameter | Description | Required/Optional | Allowed values |
---|---|---|---|
form_id | Form Identifier | Required | |
query | This is a mongo pipeline query that is fetched from the Search service. (default: “{\n ‘reason_for_no_pymongo_pipeline’: ‘No user request provided’\n}”) | Optional | |
download_format | The format in which the results need to be viewed | Optional (default : “JSON”) | JSON, CSV |
Response:
--download_format = "JSON"
{'docs': [{'testEnitity1': '2023-08-03T00:00:00'}]}
--download_format = "CSV"
testEnitity1
0 2023-08-03T00:00:00