The Reconstructed Patent Registry of Mandate Palestine Search FAQ
The Reconstructed Patent Registry of Mandate Palestine (1924-1948)
The search options are meant to be user-friendly and intuitive; where they are not, please blame the British Government of Mandate Palestine! We invite you to browse, explore, play, use, and share!
General Queries
In the Background section, we present some overall trends. These include patent applications over the years, the applicants’ countries of origin, patents over the years in some industries, etc.
Yes! If you wish to explore the raw data – you are more than welcome to do so. The underlying database is available for download here.
The reconstructed registry, raw data, and accompanying explanations are freely available for non-commercial use under a Creative Commons Attribution-NonCommercial 4.0 International License (non-commercial uses are permitted with attribution; commercial uses require separate permission).
We suggest:
Michael Birnhack, Mandate Palestine’s Reconstructed Patent Registry (1924-1948) (Tel Aviv University, 2025), available at https://en-law.tau.ac.il/MandatePalestineIP.
The Methodology section provides a detailed description and explains how we solved various challenges.
- If you want to browse the dataset, the best way to see all applications is to search for “application”, which returns every application in the dataset; you can then apply further filters.
- By the application number. This is a good start. Be careful when drawing conclusions about priority from the numbers. Note the comments in the Methodology and Background sections about the various channels of registration, which mean that a number is a reasonable proxy for the date, but not a perfect one.
- By the applicant’s name. This is also a good start. Note that names may not be consistent: a person’s or company’s name may appear in several versions. We grouped them together where there was no doubt, e.g., “Flower Inc.” and “Flower Incorporated,” but where the names changed, we left them in the original form.
- Some countries had only a few applications, so if a particular country is your focus, searching by country may be more effective than more specific searches.
If you are unsure about the spelling of any of the search criteria, you can submit a general search query (“application” will return the entire dataset), then apply the search filters on the right of the results and choose the desired criteria from the list.
Yes! You can search for patents according to the product they describe. There are two ways to do this:
- If you know the patent’s field, you can search or filter your search results by the patent’s categories. Note: the categorization was not part of the original Mandate data and is of our own making.
- If you don't know the technological field, search the “Kind of Product or Services” field.
- There were three channels for filing patent applications in Mandate Palestine: (1) applying directly in Palestine with a new original application; (2) relying on an already-granted British patent; and (3) claiming the priority date of an application filed in another country that was a member of an international convention, namely the Paris Convention. If you are interested in searching these categories, first search “application” (which yields all patents in the dataset), then apply further filters using the “Application Priority” option:
British Registration means that the patent application was originally filed in the United Kingdom and subsequently refiled in Mandate Palestine.
Convention Application – if the patent application claimed priority from an earlier filing in a specific country under an international convention, you will see the country name followed by “Convention Application”.
All other patents were first submitted in Palestine.
The details for a particular patent indicate the priority country, the date of filing in the other country, and the date of the application in Palestine.
Yes, you can search the database using keywords. We have applied Optical Character Recognition (OCR) technology to scanned PDF files to make the text machine-readable, enabling keyword searches in patent specifications and claims. However, please note that due to the age of the documents and some handwriting applied to them, the OCR may not be accurate. OCR technology does not always recognize handwriting or manual erasures, both of which were common in these historical files. Users should cross-reference the OCR results with the original scans for verification.
We conducted OCR in three languages: English (the most common language in the dataset), Hebrew (the second most common), and German (used in only a few applications). The accuracy of recognition varies by language, with English typically yielding the best results. Hebrew and German documents may show a higher rate of OCR errors due to font styles and document condition.
You can search for patents in a specific language or a combination of languages. Submit an initial general query (“application” will return the entire dataset), then filter (on the right-hand side) according to the languages.
The AI summary metadata field is a summary of the patent application obtained via a Large Language Model (LLM). The summaries were produced by Google’s Document AI, to which we fed the OCR outputs of each patent application. The following prompt was used to generate the summaries: “This document is a historical patent application. You are tasked with summarizing it for the purpose of library cataloging. Please give me a summary of the following text up to 150 words. Do not include names of people or dates.”
To begin with, these AI-generated summaries provide a quick overview but should not replace reading the full patent text for detailed analysis. Second, as of July 2025, the practices and metrics used to evaluate the accuracy of AI-generated summaries are still an active area of research. However, there are some promising pipelines for evaluating AI-generated summaries, such as the one we adopted, named G-Eval. Put simply, G-Eval uses an LLM to evaluate the quality of summaries without a ground truth. Using this approach, we fed 12 patent applications and their corresponding AI-generated summaries to Gemini, which judged and rated their quality along four key dimensions: coherence (4.8), consistency (4.9), fluency (3.0), and relevance (4.8), on a scale of 1-5 where higher is better. So while the fluency score is a bit on the low side, which may reflect the awkward phrasing used in the original patent applications, the other dimensions scored almost perfectly. We should also note that we employed some (limited) human evaluation of these summaries and judged them adequate for cataloging and general use.
We use the DD/MM/YYYY format (sorry, dear Americans!).
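For users who download the raw data and process it programmatically, dates in DD/MM/YYYY can be parsed with Python's standard library. This is a minimal sketch; the sample date string is illustrative, not taken from the registry:

```python
from datetime import datetime

# Dates in the registry use the DD/MM/YYYY format.
d = datetime.strptime("14/05/1948", "%d/%m/%Y").date()
print(d.isoformat())  # prints "1948-05-14" (ISO format, convenient for sorting)
```

Converting to ISO format (YYYY-MM-DD) makes the dates sort correctly as plain strings, which is handy in spreadsheets and scripts alike.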
The search option lists applicants by frequency, from the most to the least active. The dropdown list contains only the 20 most active applicants. If you know the applicant’s name, you can search for it using the regular or advanced search options.
- Yes! For all countries we provide only the country’s name, rather than the city; for Palestine, we provide both the country and city. You can further filter the results by city.
- We use Netherlands rather than Holland; England rather than UK or Great Britain.
For 34 applications (or patents), there are no files. This omission was in the original registry.
For 33 applications the language is ‘undetermined’ – for these, we do not have the applications themselves, so could not determine their original language.
- Yes. We added tags of Kind of Product or Services. This is our categorization. You may categorize otherwise, of course.
- We categorized many inventions in more than one category.
- In the search options and search results you will find names of patent agents.
- There are a few unusual categories and comments:
- If the search/result is < applicant >, it means that the applicants handled their own paperwork, without the assistance of an agent. There were 231 such applications.
- < Signature unidentified > means that the applicant did have a patent agent, but the name was not listed in the official documents, and we could not identify the signature that appears on the patent documents themselves. There were 53 such cases.
- < Unknown – no signature > means, well, that we do not know whether there was an agent or who they were. There were 46 such cases.
- There were three channels to apply for a Palestine Patent:
(1) Local application. These carry no special marking; they are the applications not covered by the other two options.
(2) An application based on a British patent. To search for these, select “British Registration” (272 such applications).
(3) An application based on one submitted in a country that was a member of an international treaty, in which case the priority date was that of the foreign application. For these, select the countries you are interested in. Note that 175 such applications cited ‘Great Britain’ as the priority country; these should not be confused with the previous channel.
395 applications that were filed with the British PTO in Mandate Palestine were not decided during the Mandate; they were carried over to the Israeli PTO after May 1948 and reviewed by the Israeli staff. To find these, the magic word is <Reshumot>, which is the Hebrew name of the Israeli Official Gazette.
- If you notice any errors, please let us know, using this email. Your contributions will help us enhance the registry’s quality and ensure a more reliable resource for future users. Thanks!
All patent applications were originally received as image files. To convert the images into machine-encoded text, we used Optical Character Recognition (OCR). Our OCR platform for this project was ABBYY FineReader PDF 16, with OCR speed and accuracy set to Thorough recognition and Document type set to Typewriter.
There are different metrics for evaluating OCR accuracy, the most common of which are WER (Word Error Rate) and CER (Character Error Rate). Both metrics are computed against Ground Truth (GT), which is a manual transcription of the original file(s). Once GT is obtained, WER is calculated as the number of word-level errors (like misreading 'patent' as 'parent') divided by the total words in the document, while CER does the same with character-level errors (like reading 'o' as 'a'). So, if we have a document with 100 words and the OCR makes 3 word-level errors, the WER would be 3/100 = 3%. Similarly, if that same document contains 500 characters total and the OCR makes 5 character-level errors, the CER would be 5/500 = 1%.
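The computation described above can be sketched in Python. This is an illustrative implementation (the function names and sample strings are our own, not part of our evaluation pipeline), using a standard Levenshtein edit distance to count errors:

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn the reference sequence into the hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def wer(ground_truth, ocr_output):
    """Word Error Rate: word-level edit distance / total reference words."""
    ref = ground_truth.split()
    return levenshtein(ref, ocr_output.split()) / len(ref)

def cer(ground_truth, ocr_output):
    """Character Error Rate: character-level edit distance / total reference characters."""
    return levenshtein(ground_truth, ocr_output) / len(ground_truth)

gt  = "the patent claims a new process"
ocr = "the parent claims a new process"   # 'patent' misread as 'parent'
print(f"WER = {wer(gt, ocr):.0%}")  # prints "WER = 17%" (1 error in 6 words)
print(f"CER = {cer(gt, ocr):.0%}")  # prints "CER = 3%" (1 error in 31 characters)
```

Ready-made libraries such as jiwer implement the same metrics; the sketch above just makes the arithmetic from the previous paragraph explicit.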
To evaluate the accuracy of the OCR of our patent applications, we obtained and validated our GT by manually transcribing 104 pages from different patent files (thanks, Amalya!). These files were selected to represent variations in language, layout, and quality of the source material. We then ran ABBYY's OCR technology and calculated the WER (13%) and CER (8%). This translates to word-level accuracy of 87% and character-level accuracy of 92% - decent results for documents nearly a century old, without any image preprocessing.