Pharos Blueprint receives data into its reporting database via three "on ramps":
- Blueprint Tracker/PrintScout exports from deployed workstations and print servers
- Imports from the included Pharos SiteMonitor application (versions 5.0 and newer)
- Terminal (Omega, Sentry, iMFP) data captured from Secure Release Here, copying, scanning, and faxing (NOTE: copy, scan, and fax data capture is terminal- and copier-model-specific; not all devices capture everything)
Depending on your site configuration, you may not be getting data from one of these paths, or from some combination of the three. This Knowledge Base article describes the paths this data takes into the Blueprint database, and where things can break. This blog will focus only on the Tracker/PrintScout path.
Blueprint Tracker/PrintScout Data
Author's Note: Rather than persist with the "slash" naming convention, since the client component has a different name depending on the version of Blueprint you have, I am going to simply call it "the Blueprint client" from here on out.
The Blueprint client wraps both local (USB, parallel, serial, self-hosted Standard TCP/IP, LPR, and file-system connections) and remote port monitors (Windows remote - for print server-hosted queues, Microsoft IPP, Novell NDS/iPrint, etc.) for the purposes of print job capture.
Getting Data from the Client
When to Connect
The data recorded from the print job is stored locally as an XML file (potentially encrypted) until one of the following occurs:
- Its maximum batch size, in number of stored XML files, is reached
- It is online and can send the files during its next "batch send" time
Both of these are determined by settings on the parent Blueprint Collector, stored under Trackers (Blueprint 5.1 and lower)/Print Scouts (Blueprint 5.2 and newer) > Settings > Basic Settings (tab) > Print Job Batching:
By default, Blueprint Collector installs with the following configuration:
- Maximum batch size: 100
- Forward batches to Server starting between 8:00 PM and 2:00 AM
- Send jobs as soon as maximum batch size is reached: Disabled
How to Connect
Furthermore, the method of transport to the Collector is a pairing between the SSL setting selected within the Collector's configuration (using the Blueprint Server Configuration tool) and the way the client is installed. The Collector-side options are:
- None. The Collector only accepts HTTP (TCP port 80 by default) connections from clients.
- Optional. The Collector will accept either HTTP or HTTPS (TCP port 443 by default) connections from clients.
- Required. The Collector only accepts HTTPS connections from clients.
Based on the setting chosen, you must install the client accordingly. This happens through the "/serverurl" switch specified when running the client installer. Specifying "/serverurl=collector.mycompany.com" or "/serverurl=http://collector.mycompany.com" tells the client to use HTTP; specifying "/serverurl=https://collector.mycompany.com" tells the client to use HTTPS.
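The scheme-to-transport mapping is easy to sketch. The helper below is illustrative only (it is not part of the Blueprint installer); it simply shows how a bare hostname ends up meaning HTTP:

```python
from urllib.parse import urlparse

def transport_for_serverurl(serverurl: str) -> str:
    """Infer the transport a client would use from a /serverurl value.

    A bare hostname (no scheme) is treated like http://, matching the
    behavior described above. Illustrative helper, not Pharos code.
    """
    if "://" not in serverurl:
        serverurl = "http://" + serverurl
    return "HTTPS" if urlparse(serverurl).scheme == "https" else "HTTP"

print(transport_for_serverurl("collector.mycompany.com"))          # HTTP
print(transport_for_serverurl("http://collector.mycompany.com"))   # HTTP
print(transport_for_serverurl("https://collector.mycompany.com"))  # HTTPS
```

The practical takeaway: if the Collector is set to "Required" but clients were installed with a plain hostname, every upload attempt will be refused.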
What Can Go Wrong?
When to Connect. The most common failure point is telling the client to send print job batches at times when the computer is turned off, sleeping, hibernating, or disconnected from the network. In other words: the default Collector settings are probably not the best starting point. The thinking behind the defaults was noble: have your client base upload when, hopefully, the network and systems are not busy, i.e. after business hours. But obviously, if the computer is being turned off (or put to sleep, or a laptop/Windows tablet is going home for the night), the client won't be available when it is supposed to be doing its job. The problem compounds at startup: the Blueprint client checks, notes that it missed its upload schedule, and then sets another send time, unless the startup time is more than 24 hours after the expired send time. If there is a >24 hour gap, the client will, if it can connect to the Collector, queue an "on demand" job upload. If this happens daily, though, each newly set time will again fall outside the client's "on" hours. So what to do?
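A minimal sketch of that startup decision may help make it concrete. This is my interpretation of the behavior described above, not Pharos code; the function name is mine, and for simplicity it assumes a send window that does not cross midnight (unlike the 8:00 PM to 2:00 AM default):

```python
import random
from datetime import datetime, timedelta

def on_startup(now: datetime, cached_send_time: datetime,
               window_start_hour: int, window_stop_hour: int):
    """Illustrative sketch of the startup recovery logic described above.

    A missed send time more than 24 hours old triggers an on-demand
    upload; otherwise a new random send time is picked inside today's
    configured window (assumed here not to cross midnight).
    """
    if now - cached_send_time > timedelta(hours=24):
        return ("upload_on_demand", None)
    # Pick a random second within today's configured send window.
    window_seconds = (window_stop_hour - window_start_hour) * 3600
    offset = random.randrange(window_seconds)
    new_time = now.replace(hour=window_start_hour, minute=0, second=0,
                           microsecond=0) + timedelta(seconds=offset)
    return ("reschedule", new_time)
```

Run with a window of 8:00 am to 6:00 pm and a send time missed by 9.5 hours, it reschedules; missed by more than a day, it goes on-demand. The defect with the default settings is visible here: if the machine is never awake inside the window, "reschedule" just repeats forever.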
I like to discuss things with the stakeholders (normally the networking team) to help them understand what the impact is going to be. As an illustration, I'm going to work with the following numbers:
- Deployed base: 100,000 clients
- Average jobs per user per day: 20
- Average XML file size (job data): 6KB
So on a regular working day (9 am to 5 pm, wherever the user might happen to be), the total network impact is going to be 100,000 × (20 × 6 KB) = 12,000,000 KB, or roughly 11.4 GB of data. That seems like one fantastic chunk of data, doesn't it? But let's look at email. I'm going to use my "Sent Items" folder from last week as the benchmark:
- Deployed base: 100,000 clients
- Average sent emails per day: 15
- Average email size: 50KB
I'm not going to go through the math; 50 KB is far larger than 6 KB, so the email total will be much larger than the Blueprint clients', even with five fewer items per day. So it's normally fine to set the upload schedule during working hours. And this 11.4 GB doesn't arrive all at once. The client creates a random "batch send time" based on the Start/Stop time it's been given, so the 11.4 GB is spread across the hours specified in Basic Settings, alongside the "maximum batch" threshold and the client's availability. If the Start/Stop time is between 8:00 am and 6:00 pm, that's 10 hours of time for the randomized uploads. And if the "maximum batch size" is set to something reasonable (75% of the daily average, so 15) and enabled, the client load is spread out even further.
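The arithmetic above is easy to sanity-check in a few lines (the figures are the illustration's assumptions from earlier, not measurements):

```python
# Illustration numbers from above, not measurements.
clients = 100_000
jobs_per_user_per_day = 20
job_kb = 6                      # average XML job file size, in KB

print_kb = clients * jobs_per_user_per_day * job_kb
print_gb = print_kb / 1024 / 1024
print(f"print job data per day: {print_kb:,} KB (~{print_gb:.1f} GB)")

emails_per_user_per_day = 15
email_kb = 50
email_gb = clients * emails_per_user_per_day * email_kb / 1024 / 1024
print(f"email data per day:     ~{email_gb:.1f} GB")

# Spread across a 10-hour send window (8:00 am to 6:00 pm):
print(f"print data per hour:    ~{print_gb / 10:.2f} GB/hour")
```

Across 100,000 clients, ~1.1 GB per hour network-wide works out to a trickle per machine, which is why a working-hours window is usually a non-issue.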
How to Connect. There is a straightforward matrix of settings that supports a successful client/server communications path:
The trickier part comes in when managing the HTTPS/SSL communications, since you have to ensure that the workstations and servers share the same certificate chains (with unexpired certificates) for error-free communications. Resolving "How to Connect" is a matter of ensuring that the server and client settings, and the certificates if using SSL, match.
How To See "When to Connect" Go Wrong
Reporting is normally the first place a deficiency in client-based data shows up. You see it when the MPS bill comes in at 1.4 million clicks for August, but your Blueprint report shows only 5,000 printed pages for the same month. Without going from workstation to workstation, the quickest place to check whether the Blueprint client is the failing path is within the Blueprint Administrator on the Analyst. Expand the Tracker/PrintScout group and choose the Machines view. Each row you select shows a roll-up view at the bottom of the user interface that contains essential "client health" data:
The "Last Print Job Sent" value is the timestamp of the most recent print job the client was able to both capture and upload. Because the Analyst gets this data through a chain (Client > Collector > Analyst), this date will normally be some time in the near past, but hopefully not much older than 2 days or so. Some EXCEPTIONAL EVENTS will change the "Last Print Job Sent" times, making them much older:
- Some computers have users who do not, or cannot, print, so if you see entire groups not reporting data, you probably have a workflow getting in your way.
- People go on vacation or are otherwise away for long times, and when this happens, their computers go on vacation, too, so expect some seasonal variations to the theme.
- Teleworkers and travelers who rely on software VPN solutions to connect will most likely upload their data infrequently, but when it arrives, it can be a bunch at once; I'll touch on how to handle this below.
The "Last Heard From" value is the last time the client completed a health check with its Collector. Again, because of the import chain, the value here will most likely lie somewhat in the past. Exceptional events 2 and 3 from above apply here, too. But note that a workstation that never prints will still send heartbeats to its Collector.
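The interplay of these two timestamps can be boiled down to a small decision function. This is an illustrative sketch only; the two-day staleness threshold follows the "hopefully not much older than 2 days or so" rule of thumb above and is not a Pharos default:

```python
from datetime import datetime, timedelta

def triage(last_print_job_sent, last_heard_from, now,
           stale=timedelta(days=2)):
    """Illustrative triage of the two 'client health' timestamps.

    Remember the exceptional events above: an old 'Last Print Job Sent'
    on a vacationing or non-printing user's machine is not a fault.
    """
    if last_heard_from is None or now - last_heard_from > stale:
        return "client offline or not checking in"
    if last_print_job_sent is None or now - last_print_job_sent > stale:
        return "client healthy but can't upload jobs"
    return "client healthy"
```

A recent heartbeat paired with a stale (or absent) print-job timestamp is the signature of the upload problem described next.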
If the "Last Print Job Sent" for workstations is either very old or non-existent, but "Last Heard From" is recent, then you probably have an "I can't upload" issue. You can confirm this at the workstation level by looking at the ProfilerControlTask.log file found in C:\ProgramData\PharosSystems\PrintScout\Logs or C:\ProgramData\PharosSystems\Blueprint\Logs. You will see a line like this:
[2016/09/08 09:10:32 PB10 TB38 i ProfilerControlTask] BatchSendTime has passed, using cached time = '2016-09-07 23:40:42'
Breaking apart the log line, the first date and time records when the log line was written, so the 8th of September at 09:10:32. The message for this event is "BatchSendTime has passed", which is the client recognizing that it missed its previously-set time to upload job data, which, according to the cached time, was the 7th of September at 11:40:42 pm. A bit further down, this line is present in the log:
[2016/09/08 09:11:41 PB10 T1640 d ProfilerControlTask] Batch send time has passed, but nothing to do so recalculating.
This is the client acknowledging that the batch send time has passed, but that the missed time isn't yet 24 hours old (only about 9.5 hours have passed). Then, a bit farther down (but not much), these lines appear:
[2016/09/08 09:11:41 PB10 T1640 d ProfilerControlTask] -> CGlobalState::CalculateBatchSendTime
[2016/09/08 09:11:42 PB10 T1640 i ProfilerControlTask] BatchSendTime = '2016-09-08 22:40:04'
This is the event that sets the new BatchSendTime; in this case, it will be on the 8th of September at 10:40:04 pm.
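If you have many of these logs to triage, the timestamp arithmetic can be scripted. Here is a quick sketch based on the log lines shown above; my reading of the bracketed fields as timestamp, process, thread, level, and task is a guess from the samples, not documented format:

```python
import re
from datetime import datetime

# Matches lines like:
# [2016/09/08 09:10:32 PB10 TB38 i ProfilerControlTask] BatchSendTime has passed, ...
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<proc>\S+) (?P<thread>\S+) (?P<level>\S+) (?P<task>\S+)\] (?P<msg>.*)")

def parse_line(line):
    """Split a ProfilerControlTask.log line into its fields."""
    m = LINE_RE.match(line)
    if not m:
        return None
    fields = m.groupdict()
    fields["ts"] = datetime.strptime(fields["ts"], "%Y/%m/%d %H:%M:%S")
    return fields

entry = parse_line(
    "[2016/09/08 09:10:32 PB10 TB38 i ProfilerControlTask] "
    "BatchSendTime has passed, using cached time = '2016-09-07 23:40:42'")
cached = datetime.strptime(
    re.search(r"cached time = '([^']+)'", entry["msg"]).group(1),
    "%Y-%m-%d %H:%M:%S")
gap_hours = (entry["ts"] - cached).total_seconds() / 3600
print(f"send time missed by {gap_hours:.1f} hours")
```

For the sample line, the gap works out to about 9.5 hours, under the 24-hour threshold, which is why the client recalculates rather than uploading on demand.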
Fixing the Problem
All that needs to happen to resolve this problem is to adjust the batch send time on the Collector(s) hosting the clients. When each client refreshes its configuration from the Collector, those settings take immediate effect. The settings shown in the screenshot earlier in this blog work well for sites of all shapes and sizes.
Handling VPN Workstations - The Special Group
The teleworker/traveler workstations that rely on VPN connectivity present a challenge to how the Blueprint client behaves. First and foremost, these workstations are typically not online immediately after startup, so the client's initial configuration request fails and it falls back to a cached configuration in order to function; a configuration refresh occurs once the workstation is online. This is not inherently bad (it is why a "last known good" configuration is kept available), but it does mean this group will not receive updates immediately. From a deployment standpoint, a common suggestion is to dedicate a Collector server to the VPN workstations so that they can be given a special batch-loading configuration that moves jobs off those workstations more rapidly than workstations that "live" on the organization's network.
Within this special Collector, the goal is to set a very small "maximum batch" size and enforce it, and ensure that the upload time matches the local workstation's operating hours:
The settings above ensure that the workstation sends its job data as soon as it is created, so very little backlog builds up locally. And it keeps doing so well into the evening hours, but before Collectors generally start their nightly upload to the Analyst. This keeps the inbound job data fresh.