Create PDF Documents With Searchable Text from Google Chrome and Microsoft Edge

Win2PDF now has a feature that allows you to print documents that would normally contain non-searchable text to PDF files with searchable text.

Why this feature, and when would you use it? It is most useful when printing PDF files from Google’s Chrome web browser, Microsoft’s newest Edge browser, or other Google apps like Docs. Due to the way Google and Microsoft have developed their browsers and apps, printing from these programs creates PDF files that are image-based and not searchable (or selectable) as actual text. When documents or web pages are printed to a paper printer, this isn’t noticeable or an issue. However, it is a problem if you are using Win2PDF or another PDF printer, since the files will be larger, non-searchable, and non-selectable.

We’ve solved this problem by adding a new save format called “Portable Document Format – Searchable (OCR PDF)”. When you use this save option when printing from Chrome, Edge, or Google Docs, the resulting PDF file will contain searchable text. It applies Optical Character Recognition (OCR) to the file and converts the image-based text into searchable text automatically.

This has been frequently reported to our Win2PDF help desk as a problem for users, and prior to this feature we had to explain a multi-step process to get the desired results. Now, it’s just a single save, like it would be from any other application.

This feature is still in our pre-release testing phase, but we want users to try this and give us some feedback. To try this feature, please do the following:

    1. Download and install Win2PDF 10.0.78 (or higher). This version can be downloaded from the Win2PDF 10 Update section of our knowledgebase.
    2. Download and install the Win2PDF Desktop with OCR Download.
    3. After you install the separate Win2PDF Desktop with OCR package, Win2PDF displays an extra save as type labeled “Portable Document Format – Searchable (OCR PDF)”.

While this is useful when you are creating new PDFs from Chrome or Edge, what about existing files that had previously been saved as image only, or that you received as email? Is there a way to “fix” those so that they are searchable?

Yes. Just open the original PDF in the Win2PDF Desktop App and select Export -> PDF – Searchable (OCR) from the File menu.

The Searchable OCR PDF is only available in our pre-release software and we’re working on improvements, but give it a try and if you have any feedback or issues, let us know by sending an email to [email protected] or opening a ticket at our Helpdesk page.

Researchers Say PDFs Are ‘Unfit for Human Consumption’

We just stumbled across this Vice article titled Researchers Say PDFs Are ‘Unfit for Human Consumption’. It references a new paper published by the Nielsen Norman Group outlining the problems with the PDF format that still exist.

“The format is intended and optimized for print. It’s inherently inaccessible, unpleasant to read, and cumbersome to navigate online. Neither time nor changes in user behavior have softened our evidence-based stance on this subject,” the article reads. “Even 20 years later, PDFs are still unfit for human consumption in the digital space.”

Ouch!

While it is an interesting read and does outline some very real limitations of the Portable Document Format as well as strategies to make them more user-friendly, it is, after all, a document format. The primary function of PDF is to make files universally available on all platforms and to preserve the formatting and layout of the original documents.

Court filings, for example, require a consistent and universally accepted standard for submitting electronic documents. Most companies require standardization of company forms across their business practices. Government agencies like the IRS need standard forms and documents for processing. So do hospitals and clinics working with patient medical records. Many electronic texts require a specific layout of images and text in order to be understood correctly and in context. There are many more examples in just about every industry. And to accomplish this, you really need a standards-based document format.

Having said that, it’s certainly appropriate to make some information available in other formats, especially if the information needs to be dynamically formatted to different sized screens and for different users, but it’s hard to fault PDF because it doesn’t work for all users in all situations.

Also, it should be noted that there really aren’t any viable alternatives to Adobe PDF for enterprise users where these types of considerations are paramount. Microsoft did try to gain support for its XML Paper Specification (XPS), but it never took hold as a replacement for PDF.

While PDF files do have limitations, especially for users reading the files on small screens like phones or tablets, they still provide the best technology for creating, archiving, and sharing electronic documents. Adobe’s blog gives many reasons why PDFs are better than other proprietary formats.

How to Create a Non-Searchable PDF File

When you create a PDF file from most applications, the result is a PDF that contains both text and images. The text can be searched from PDF viewers like Adobe Reader, can be cut & pasted into other documents, and it can also be indexed and searched by search engines like Google or Bing.

However, some people want to create PDF files that are NOT searchable for a variety of reasons.

We posted an example some time ago where some sensitive documents were redacted in the PDF, and even though they displayed correctly (where the text appeared blacked-out), the actual text in the PDF file was searchable and selectable. Whoops!

There are also situations where lawyers litigating cases need to share documents with the opposing side, and they have an interest in dumbing down the PDF file. That is, making it very difficult to search through the documents.

Whatever the reason, the easiest way to create non-searchable PDF files is to use the PDF Image Only file save option with Win2PDF. This will save all text in the document being printed as an image, so that it can’t be searched or indexed by search engines. You can save the output as either a monochrome image, or a color image depending on your needs.

One caveat with this feature is that it will make the file sizes larger, which is usually not desirable.

Unless, that is, you’re a lawyer litigating a case…

Win2PDF Release 10.0.72 Now Available

A new version of Win2PDF (a free update for existing Win2PDF 7 or Win2PDF 10 users) is available now at the Win2PDF download page. In addition to bug fixes and stability improvements, it adds several new features to Auto-Naming PDF files, command-line options, and the Win2PDF Desktop App. Win2PDF 10.0.72 includes the following new features:

  • Adds a Configure Win2PDF Auto-Name shortcut to the Start menu to make configuring the Auto-Name features easier.

  • Adds Send File and Print File actions to the Configure Win2PDF Auto-Name menu to automatically email or print a PDF. See How to Automatically name and send PDF files by email for an example of automatically naming and sending invoices to email recipients based on the contents of the PDF.

  • Performance and stability improvements

Again, if you have a license for Win2PDF 7 or Win2PDF 10, you can download this new version at no charge.

Can Win2PDF be used as a Shared Printer?

We get this question a lot, and it’s worth discussing.

First, what do we mean by a shared printer?

Typically, a shared printer refers to a paper printer that is attached to a print server and then shared to multiple workstations on the network. It is very useful for a paper printer because it allows one physical resource (the printer) to be accessed and utilized by many users. Each workstation can print files, and then that printer file data is sent to the shared printer where it is queued up with other print jobs until each file is printed.

If you’ve ever worked in an office setting with many users sharing a printer, you’re probably familiar with the line of people waiting for their printouts because ‘that coworker’ sent a 200-page manual ahead of you. It has its drawbacks.

The primary advantage, of course, is cost savings since one printer can be utilized by many users.

So, back to the original question: Can Win2PDF be used as a shared printer?

The short answer is it can be used as a shared printer, with limitations, in some configurations, although we don’t officially support it. The reasons for this are likely true for just about any virtual printer. Win2PDF, like most virtual printers, simply works differently than a paper printer:

  1. There are no cost savings as there are with a paper printer. The licensing for Win2PDF is ‘per workstation’, and the Win2PDF TSE license does not allow you to share the Win2PDF printer to an unlimited number of workstations. One license is needed for each computer, whether it is using the locally installed Win2PDF printer, or accessing the Win2PDF printer over the network. The price is the same in either configuration.
  2. There is no performance advantage like a paper printer might have. A shared paper printer may accept print jobs from multiple workstations, and then do further processing on the printer. But this isn’t the way Win2PDF works. Since Win2PDF converts files directly to PDF, it doesn’t offload any of the processing capability to the print server. The workstation does all of the processing.
  3. Win2PDF relies on some Windows operating system components to work correctly. When you run the Win2PDF setup program, the setup program checks for the existence of these needed Windows components, and then installs them if necessary. If Win2PDF is only installed on a print server, and then shared over the network, this check is not performed. Other workstations on the network may be able to access the shared Win2PDF printer, but the printing may fail if their specific machine is missing components.

When it comes to virtual printers, since there are no inherent cost benefits or processing speed gains, what most customers really want is an easy to deploy and manage PDF solution for a large number of users.

And that can be addressed in other ways that make more sense for a virtual printer. For larger numbers of users, we provide a volume license installer that can be used to push licensed copies to each workstation. And that, coupled with our silent install options, makes it easy to get Win2PDF on each desktop so that users can create PDF files hassle-free.

More details on this issue can be found at our online support FAQ.

How to automatically name and send PDF files by email

Here is a more advanced example of using our new content-based naming feature in Win2PDF.

Suppose you want to name a PDF file based on some value within the PDF file, and then email the PDF file to an email address that also resides within the PDF file. This could be useful if a business wanted to send out invoices to customers based on an email field in the invoice, or to send a report unique to an individual client based on their specific information.

The latest version of Win2PDF, 10.0.71, includes new features that allow you to search for up to three different values within the PDF file, and then use those values for Auto-naming and for sending the email.

Let’s show this feature using the following example: Here is a sample invoice from an ERP system. It includes both an invoice number (which we’ll use for the naming of the PDF file) and it contains an email address for each invoice (which we’ll use for addressing the email).

We’ll start by opening the sample file of the invoice in the Win2PDF Desktop app. Once opened, you can use the Auto-Name ➜ Define Auto-Name Search Field menu to define up to 3 search fields.

We’ll define two of the search words as follows:

“Search Field 1” = “Invoice #:”
“Search Field 2” = “CUSTOMER EMAIL:”

When you select the Define Auto-Name Search Field, you’ll get a window where you can enter the text you wish to search for (see image below). The value immediately following this search field will be used in the configuration screen.

After you have defined the search fields, Win2PDF will display the values of the search in the current document as a confirmation.

Using our sample document as a reference, the values (the text immediately following) for the search fields are:

“Search Field 1” value = “01357”
“Search Field 2” value = “[email protected]”

These values can and likely will change for each document being printed.
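
As a rough illustration of the idea (not Win2PDF’s actual implementation), a search field can be thought of as “find this label, then take the text that immediately follows it”. The Python sketch below applies that to the plain text of an invoice. The sample invoice text, the example email address, the output file name format, and the function name `value_after` are all hypothetical and used only for illustration.

```python
import re
from typing import Optional

def value_after(label: str, text: str) -> Optional[str]:
    """Return the first whitespace-delimited token that follows `label`, or None."""
    match = re.search(re.escape(label) + r"\s*(\S+)", text)
    return match.group(1) if match else None

# Hypothetical invoice text; in practice this would be the printed document's text.
invoice_text = """\
ACME Supply Co.
Invoice #: 01357
CUSTOMER EMAIL: billing@example.com
"""

# "Search Field 1" and "Search Field 2" from the example above.
invoice_number = value_after("Invoice #:", invoice_text)
recipient = value_after("CUSTOMER EMAIL:", invoice_text)

# Assumed naming scheme for the sketch: Invoice-<number>.pdf
pdf_name = f"Invoice-{invoice_number}.pdf"
print(pdf_name, "->", recipient)
```

Because the lookup is keyed on the label rather than on a position, the same two search fields keep working when the invoice number or email address changes from one document to the next.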

Next, when you go to the Auto-Name ➜ Configure Auto-Name… menu, these two search fields will then be used for the file name, and for the email recipient.

Now when you print an invoice to Win2PDF from the ERP system, the files will be automatically named and attached to a new email message as shown below.

You could take this one step further and set up these PDF files to be sent automatically using our Win2PDF Mail Helper add-on application.

There are many options available for naming and emailing PDF files automatically — more than we can reasonably cover in this post. So, if this does sound useful and you need help configuring this for your own reporting applications, let us know and we’d be happy to assist you setting this up for your needs.

Win2PDF Report Server for Legacy Reporting Applications

We’ve recently added new capabilities to our Win2PDF Terminal Server Edition (TSE) software that allow it to be used with legacy applications that were originally designed to create paper reports.

What do we mean by legacy reporting applications? In this context, what we are referring to is a class of (typically) older software programs that sent text or special printing data directly to a dedicated paper printer. Essentially, these reporting applications would stream the data directly to a laser printer connected to the local area network, and the paper printer would print whatever information was sent.

“Now more than ever, with employees working remotely or from home, it’s important for companies to adapt their interfaces to legacy programs so that the reports can be captured and shared electronically as PDF files.”

The new Win2PDF Report Server component of Win2PDF TSE does the following:

  1. Creates a dedicated copy of the Win2PDF printer named Win2PDF Report Server
  2. Installs a Windows AppSocket Service that uses the AppSocket printing protocol, also known as Port 9100, JetDirect, or RAW printing mode. The service accepts print data from a legacy application and then routes it to the Win2PDF Report Server application installed on the server*.
  3. Installs a console monitor program called the Win2PDF Report Server. This program is added to the startup folder and takes data routed from the Win2PDF AppSocket Server and converts it to PDF using the Win2PDF Report Server printer.

* Note: The Win2PDF Report Server has the capability to support additional printing protocols based on need. If you have an application that relies on a protocol other than AppSocket, let us know and we can provide more information.

The Win2PDF Report Server currently accepts data in text or Printer Control Language (PCL) format, and the created PDF can be named and saved on the customer’s network using any of the numerous Win2PDF Auto-name options.
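
To make the AppSocket side more concrete: AppSocket (RAW, port 9100) printing is a plain TCP stream, where the sender connects, writes the print data, and closes the connection. The Python sketch below shows what a legacy application effectively does when it sends a text report this way. It is a minimal sketch, not part of Win2PDF; the function name `send_raw_job` and the host name in the usage comment are hypothetical.

```python
import socket

def send_raw_job(host: str, data: bytes, port: int = 9100, timeout: float = 10.0) -> int:
    """Open a TCP connection, send the print data unchanged, then close.

    Returns the number of bytes sent. Port 9100 is the conventional
    AppSocket/RAW port; closing the connection marks the end of the job.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
    return len(data)

# Example usage (placeholder host name, left commented out so nothing is sent):
# report = "DAILY SALES REPORT\r\nTotal: 1,234.56\r\n".encode("ascii")
# send_raw_job("report-server.example.local", report)
```

Since the protocol carries no metadata about the job, anything like a file name has to come from the receiving side, which is where the Auto-name options mentioned above come in.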

If you have a need for this type of PDF reporting solution for legacy applications, contact [email protected] for more details.

Content-Based File Naming with Win2PDF

One new capability of the Win2PDF Auto-name feature is the ability to name and save PDF files based on content that resides “within” the document to be printed.

For example, suppose you have a reporting application that spits out a report, and you want to name this report based on the document number, or invoice number, or customer name, or some other value that exists inside of the report.

There are a couple of different ways to set the Content-Based File Naming within Win2PDF.

The first method is to use a search word.  Using this option, you would define a word to search for in the document (like document # or invoice # or customer name), and then the software would use the word or set of characters that follows it in the file name of the PDF file.

The second method is to use a defined content field.  Using this option, you would define a particular fixed location on the page where the content field always exists, and this would be used in the file name of the PDF file.  The following animation shows the basic process, but more detailed instructions can be found at the content field section of our online user guide.

[Animation: content-based naming]

How would you know which method to use?  Well, the search word option works when you know the content field value always follows a particular word, but the exact position on the page or length of this value may be variable.  The content field option works when the content field value always exists at the exact same location on the page.

Grabbing the name from a printed purchase order may use the search word option, since the length of the purchase order number may change from customer to customer.  On the other hand, grabbing a Tax ID number from an IRS form may use the content field option, since the number and location on the form will always be fixed.
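
To contrast with the search word approach, here is a small Python sketch of the content field idea: the value is read from a fixed position, regardless of what text surrounds it. This is a simplification under stated assumptions. Win2PDF defines the field as a location on the page, while the sketch models a page as lines of fixed-width text. The `ContentField` type, the `read_field` helper, and the sample form are hypothetical.

```python
from typing import List, NamedTuple

class ContentField(NamedTuple):
    line: int   # zero-based line on the page
    start: int  # first column (inclusive)
    end: int    # last column (exclusive)

def read_field(page_lines: List[str], field: ContentField) -> str:
    """Return the text at the field's fixed position, stripped of padding."""
    if field.line >= len(page_lines):
        return ""
    return page_lines[field.line][field.start:field.end].strip()

# Hypothetical fixed-layout form: the ID always occupies columns 8 through 17 of line 1.
form_page = [
    "TAX FORM (SAMPLE)",
    "Tax ID: 12-3456789   Year: 2020",
]
tax_id_field = ContentField(line=1, start=8, end=18)
tax_id = read_field(form_page, tax_id_field)
print(tax_id)
```

This only works when the layout never shifts, which is why it suits fixed forms, while variable-length values such as purchase order numbers are better served by a search word.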

As more customers are working to digitize their reporting functions (without any user interaction), we feel like this content-based file naming will become an increasingly popular tool for automating reports.

Using Auto-name to Save Reports to PDF

As the Coronavirus problem persists and more and more employees work from home, we’re getting questions from customers who are trying to adapt reporting applications to generate PDFs. Specifically, they are looking at ways of automatically saving these PDF reports without any user intervention.

The easiest way to do this is to enable our Auto-name Files feature. It can be found by clicking on the PDF Options… button on the main Win2PDF file save window as shown below.

Here are a few tips on using the Auto-name Files feature:

  • There are a variety of predefined options to use which allow you to include the document title, date, and time in the naming of the file. However, there is also a User Defined option which gives you much more flexibility. You can use any number of variables to make a customized file name based on document titles, as well as the time and date.
  • Not only can you use these User Defined variables to customize the file name, but you can also customize the folder(s) you are trying to save to. For example, if you used the following variables as the User Defined name:
%PDFYear%\%PDFMonth%\SalesReport-%PDFMonth%-%PDFDay%-%PDFYear%.pdf

You’d end up with the following results whenever you saved the PDF file.

[Default Save-to folder] \ [Current year] \ [Current Month] \ SalesReport-MM-DD-YYYY.pdf

In other words, folders will automatically be created (if they are not already present) for the year and month, and the appropriate sales report for the day will be placed in each location.

  • You can also create multiple Win2PDF printers and set auto-name for specific departments/people. For example, suppose you want to keep the normal Win2PDF printer for use with day-to-day PDF creation, but you want to make a copy named “Reports from Win2PDF” that will be used to generate PDF files from a specific application (without any user interaction). You’d simply need to:
    1. Add a copy of the Win2PDF Printer and name it “Reports from Win2PDF”
    2. Turn on the Auto-name feature (as discussed earlier in this post) for this new copy of Win2PDF
    3. You can repeat steps 1-2 to create any number of specialized Win2PDF printers that save PDF files automatically, with names and locations defined by an application, department, location, etc.
  • You can also apply Auto-Name settings to all users. To do this:
    1. Configure Auto-Name in PDF Options…
    2. Open the Win2PDF Admin Utility. The file name definition shows up on the File Name tab.
    3. Check Apply to all users
    4. Click Apply, and Auto-Name will apply to all users
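
To show how the User Defined pattern from the tips above turns into a folder and file name, here is a short Python sketch that expands the three variables used in that example. It only imitates the behavior described in this post and is not Win2PDF’s own code. The `expand_pattern` function, the `C:\Reports` default folder, the sample date, and the zero-padding of month and day (inferred from the MM-DD-YYYY result shown above) are assumptions.

```python
import re
from datetime import date
from pathlib import PureWindowsPath

def expand_pattern(pattern: str, today: date) -> str:
    """Replace %PDFYear%, %PDFMonth%, and %PDFDay% with values for `today`."""
    values = {
        "PDFYear": f"{today.year:04d}",
        "PDFMonth": f"{today.month:02d}",  # assumed zero-padded (MM)
        "PDFDay": f"{today.day:02d}",      # assumed zero-padded (DD)
    }
    # Replace each %Name% with its value; leave unknown variables untouched.
    return re.sub(r"%(\w+)%", lambda m: values.get(m.group(1), m.group(0)), pattern)

pattern = r"%PDFYear%\%PDFMonth%\SalesReport-%PDFMonth%-%PDFDay%-%PDFYear%.pdf"
relative = expand_pattern(pattern, date(2020, 4, 7))

# Hypothetical default save-to folder.
full_path = PureWindowsPath(r"C:\Reports") / relative
print(full_path)
```

The backslashes in the pattern become folder separators, so the year and month end up as nested folders under the default save-to folder, with one report file per day inside them.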

The Auto-name feature is very powerful.  If you have reports you want generated on a regular basis, and you want to define your file names and eliminate any input from the user, use these tips and let us know if you have any questions.

Working From Home Using Terminal Servers

Since the outbreak of the coronavirus and COVID-19, we’re seeing more interest in the use of terminal servers in many enterprise environments. This article from Citrix, for example, gives an overview of this shift in workplace culture that many companies are now exploring.

If you’re not familiar with the term, a terminal server is essentially a central server that hosts applications, files and shared resources like printers (virtual or physical), and then shares these programs and resources over a local network or over the internet to “terminals” (sometimes called “thin clients”). Since the applications are all loaded on just one server, it allows remote users (in this case we think of work-from-home terminals or computers) to run the business programs they need without having anything required on the local machine.

One enormous advantage to this type of solution is that the remote work-at-home “terminals” can be practically anything — home PCs, Windows laptops, Apple iMacs or MacBooks, iPads, tablets, etc. Each work-at-home terminal would simply run applications from the server and be able to print on the server. The business doesn’t need to worry about each individual client’s hardware, operating system, or local program availability.

This is where our Win2PDF Terminal Server Edition (TSE) comes into play, as a way to share and view documents remotely. Win2PDF TSE is the same as our desktop Win2PDF software (it has the exact same interface and features), but it has been adapted for a server-based, multi-user environment. It allows remote workers working from any work-at-home terminal to access company software programs and then create PDF files that can be saved on the server or on a local PC client, printed to a network drive or hard-copy printer, or e-mailed to a group of recipients. After you install Win2PDF TSE on the server, it is automatically available as a printer to all published applications on the server.

There are a variety of solutions that handle this type of terminal server deployment that are available from companies like Microsoft, Citrix, and many others. They use slightly different terms (e.g., Microsoft calls terminal services “Remote Desktop Services” and terminal servers “Remote Desktop Session Hosts”) and product names, but they work in the same general way.

If your company is considering Terminal Server deployments and wondering about PDF solutions for remote workers, download the trial version of Win2PDF TSE and let us know if you have any questions. Win2PDF TSE is licensed ‘per server’, and each server can support unlimited numbers of users or clients at no additional cost. Volume pricing for multiple servers or server farms is available as well.