Solved: Query API with Feature extraction sends response with extract_result CP_EXTRACT_RESULT_NOT_SCRUBBED

snagadiya · 2020-01-30

Hi Experts,

I always get the extract result "CP_EXTRACT_RESULT_NOT_SCRUBBED" from the Query API. Please find my requests and the responses below.

Request#1

Upload API request:

{
  "request": {
    "file_name": "DOCX.docx",
    "file_type": "docx",
    "features": ["extraction"],
    "extraction": {"method": "clean"}
  }
}
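
A minimal sketch of building this upload request body programmatically, so that quoting is handled by the JSON serializer rather than by hand. The function name `build_upload_request` is hypothetical; the field names and values are taken from the request above.

```python
import json


def build_upload_request(file_name, file_type, method="clean"):
    """Return the JSON request body (as a string) for an upload with extraction."""
    body = {
        "request": {
            "file_name": file_name,
            "file_type": file_type,
            "features": ["extraction"],
            "extraction": {"method": method},
        }
    }
    # json.dumps takes care of quoting, so no manual \" escaping is needed.
    return json.dumps(body)


payload = build_upload_request("DOCX.docx", "docx")
print(payload)
```

How the payload is attached to the upload call (for example as a multipart form field) is not shown in this thread, so that part is left out here.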

Upload API Response:

{
  "response": {
    "status": {
      "code": 1002,
      "label": "UPLOAD_SUCCESS",
      "message": "The file was uploaded successfully."
    },
    "sha1": "8064ff3d851f273df43376cfcb9c2ebd47131c8b",
    "md5": "f78a90963ca8a382da6611eb5cdbe2e3",
    "sha256": "056c1f0d31faa557cdac687b0fcc5103cc4aa0dbf8027499303e182754c981b8",
    "file_type": "docx",
    "file_name": "DOCX.docx",
    "features": [
      "extraction"
    ],
    "extraction": {
      "method": "clean",
      "tex_product": false,
      "status": {
        "code": 1002,
        "label": "UPLOAD_SUCCESS",
        "message": "The file was uploaded successfully."
      }
    }
  }
}

Request#2

Query API Request:

{
  "request": [
    {
      "sha1": "8064ff3d851f273df43376cfcb9c2ebd47131c8b",
      "file_name": "DOCX.docx",
      "file_type": "docx",
      "features": ["extraction"],
      "extraction": {"method": "clean"}
    }
  ]
}
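
The query request identifies the file by its SHA-1 digest, which matches the `sha1` value returned by the upload. A minimal sketch of deriving that digest locally and assembling the same request structure; the helper names `sha1_of_file` and `build_query_request` are hypothetical, and deriving `file_type` from the file extension is an assumption made for illustration.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_query_request(path, method="clean"):
    """Build a query request body shaped like the one shown in the thread."""
    file_path = Path(path)
    return {
        "request": [
            {
                "sha1": sha1_of_file(file_path),
                "file_name": file_path.name,
                # Assumption: the file type is taken from the extension.
                "file_type": file_path.suffix.lstrip(".").lower(),
                "features": ["extraction"],
                "extraction": {"method": method},
            }
        ]
    }
```

Calling `build_query_request("DOCX.docx")` on the uploaded file should produce a body with the same fields as the request above.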

Query API Response:

{
  "response": [
    {
      "status": {
        "code": 1001,
        "label": "FOUND",
        "message": "The request has been fully answered."
      },
      "sha1": "8064ff3d851f273df43376cfcb9c2ebd47131c8b",
      "file_type": "docx",
      "file_name": "DOCX.docx",
      "features": [
        "extraction"
      ],
      "extraction": {
        "method": "clean",
        "extract_result": "CP_EXTRACT_RESULT_NOT_SCRUBBED",
        "output_file_name": "DOCX.docx",
        "extraction_data": {
          "input_extension": "docx",
          "input_real_extension": "docx",
          "message": "Skipped",
          "output_file_name": "",
          "protection_name": "Potential malicious content extracted",
          "protection_type": "Content Removal",
          "protocol_version": "1.0",
          "risk": 0.0,
          "scrub_activity": "The file doesn't include cleanable parts",
          "scrub_method": "Clean Document",
          "scrub_result": 4.0,
          "scrub_time": "0.04",
          "scrubbed_content": ""
        },
        "tex_product": false,
        "status": {
          "code": 1001,
          "label": "FOUND",
          "message": "The request has been fully answered."
        }
      }
    }
  ]
}

Can someone please help me understand what mistake I am making here? Why is the Query API response not returning a download file ID?

PhoneBoy · 2020-01-31

The reason it's not providing a file ID is most likely the message in the response: "The file doesn't include cleanable parts".
That would suggest the file you provided doesn't need to be cleaned at all.
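
To make that message easier to spot, here is a minimal sketch that pulls the relevant fields out of a query response. The function name `summarize_extraction` is hypothetical; the field names (`extract_result`, `extraction_data`, `message`, `scrub_activity`) are the ones visible in the response posted above, and the sample is a trimmed copy of it.

```python
def summarize_extraction(query_response):
    """Collect the extraction outcome for each item in a query response."""
    summaries = []
    for item in query_response.get("response", []):
        extraction = item.get("extraction", {})
        data = extraction.get("extraction_data", {})
        summaries.append(
            {
                "file_name": item.get("file_name"),
                "extract_result": extraction.get("extract_result"),
                "message": data.get("message"),
                "scrub_activity": data.get("scrub_activity"),
            }
        )
    return summaries


# Trimmed copy of the query response from this thread.
sample = {
    "response": [
        {
            "file_name": "DOCX.docx",
            "extraction": {
                "extract_result": "CP_EXTRACT_RESULT_NOT_SCRUBBED",
                "extraction_data": {
                    "message": "Skipped",
                    "scrub_activity": "The file doesn't include cleanable parts",
                },
            },
        }
    ]
}

for summary in summarize_extraction(sample):
    print(summary)
```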

snagadiya · 2020-02-03

Hi,

Should I treat the extract_result "CP_EXTRACT_RESULT_NOT_SCRUBBED" as a success without a file ID?

In my case, we need to send a file to the Check Point API and download the cleaned file from Check Point. So, if the Query API response is "CP_EXTRACT_RESULT_NOT_SCRUBBED", I can use the original file as the clean file. Is my understanding correct?

PhoneBoy · 2020-02-03

At least that's how I understand it.

snagadiya · 2020-02-05

Hi, for some DOCX files I am getting the "CP_EXTRACT_RESULT_UNSUPPORTED_FILE" status in the query response. Which file types are not supported for the extraction method "clean"? It would be very helpful if you could point me to details or a document that describes each extract_result value. Thanks in advance.

PhoneBoy · 2020-02-11

I'm not aware of any specific limitations with respect to DOCX files.
The only limit I am aware of is size-related.
I believe it's 15 MB by default and can be configured up to a maximum of 100 MB.
A sample file that gives this error might be helpful.
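
If size turns out to be the cause, a pre-upload check along these lines could rule it out. This is only a sketch: the 15 MB default and 100 MB maximum are the figures recalled above, not confirmed values, the function name `within_size_limit` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import os

DEFAULT_LIMIT_MB = 15   # figure recalled in this thread, not a confirmed value
MAX_LIMIT_MB = 100      # likewise taken from this thread


def within_size_limit(path, limit_mb=DEFAULT_LIMIT_MB):
    """Return True if the file at `path` is no larger than `limit_mb` megabytes."""
    if not 0 < limit_mb <= MAX_LIMIT_MB:
        raise ValueError("limit_mb must be between 1 and %d" % MAX_LIMIT_MB)
    # Assumption: one megabyte is counted as 1024 * 1024 bytes.
    return os.path.getsize(path) <= limit_mb * 1024 * 1024
```

A file that passes this check but still returns CP_EXTRACT_RESULT_UNSUPPORTED_FILE would point to a cause other than size.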

snagadiya · 2020-02-14

Hi,

Please find the file attached.

PhoneBoy · 2020-02-15

I uploaded the file here and emulated it: https://threatpoint.checkpoint.com/ThreatPortal/emulation
It seems to be a clean file with nothing potentially malicious to extract from it.
That suggests that, for that file, the response you got from the API was correct.

When I upload a different file to the above site, I eventually get a link to download the scrubbed file.
That suggests that if you try a different, more complex file (I used a PPT in my test), you might get a different result than with that simpler file.

snagadiya · 2020-02-15

Thank you, @PhoneBoy.

So we will treat CP_EXTRACT_RESULT_NOT_SCRUBBED as a positive result, with no need to download anything, and continue to use the original file.
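
That conclusion can be sketched as a small decision function. The function name `choose_action` and the action labels are hypothetical. CP_EXTRACT_RESULT_NOT_SCRUBBED and CP_EXTRACT_RESULT_UNSUPPORTED_FILE are the values seen in this thread; CP_EXTRACT_RESULT_SUCCESS is an assumed name for the case where a cleaned file is available and should be checked against the API documentation.

```python
def choose_action(extract_result):
    """Map an extract_result value to what the caller should do next."""
    if extract_result == "CP_EXTRACT_RESULT_NOT_SCRUBBED":
        # Nothing was removed, so the original file is used as-is.
        return "use_original"
    if extract_result == "CP_EXTRACT_RESULT_SUCCESS":
        # Assumed value: a cleaned copy exists and should be downloaded.
        return "download_cleaned"
    if extract_result == "CP_EXTRACT_RESULT_UNSUPPORTED_FILE":
        # Extraction could not handle this file; do not treat it as clean.
        return "unsupported"
    # Any other value is left for manual review rather than guessed at.
    return "review"


print(choose_action("CP_EXTRACT_RESULT_NOT_SCRUBBED"))
```

Unknown values fall through to "review" on purpose, so that a result not covered here is never silently treated as clean.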
