To Do: LinkSync (Complete)

Priority Project #
4 LinkSync #: 3
4.01

Scheduling the scrapes

  • Schedule the scrapes to run in batches (e.g. 20 at a time, once every 6 hours, i.e. 80 a day) to minimise the chance of LinkedIn security blocking the IP
    • I recognise that for someone like me with 1.5k contacts the process will take ~20 days, but that is OK.
      • Perhaps email a progress report (and a file with a subset of the users) after each run with the updates so far (once we have fully tested this)
  • I have created a linked-inexport@atts-systems.com email address (password Descartes99) for the email to be sent from
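
The batch arithmetic above can be sanity-checked with a short sketch. The function name and its defaults are illustrative only (they are not taken from the LinkSync code); it simply assumes a fixed batch size and a fixed gap between runs.

```python
import math


def estimate_scrape_duration(contacts, batch_size=20, interval_hours=6.0):
    """Return (runs, days) needed to scrape `contacts` profiles in fixed batches.

    Hypothetical helper: assumes one batch per interval and no failed runs.
    """
    if batch_size <= 0 or interval_hours <= 0:
        raise ValueError("batch_size and interval_hours must be positive")
    runs = math.ceil(contacts / batch_size)   # the last batch may be partial
    days = runs * interval_hours / 24
    return runs, days


runs, days = estimate_scrape_duration(1500)
print(f"{runs} runs, about {days:.1f} days")
```

With 1,500 contacts this gives 75 runs over just under 19 days, in line with the ~20 days estimated above.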
Complete
4.02

Cookie view

  • Add the list of cookies (was user_password) to the dashboard?
  • Perhaps show in the user list if the cookies are not added
  • Create a help email for users to add their cookie
Complete
4.03

Search

  • Ability to build a search that they can save (e.g. Re-run JPM XVA guy)
Complete
4.04

Search

  • Add a gender and languages spoken selector
Complete
4.05

Full Scrape:  Improvements

1. Add new languages  (DONE, right?)

  • You might have noticed that in the LinkedInConnections index and in the SearchResults, I have consolidated the languages spoken into a single column (as real estate is at a premium), and I have also built “Languages spoken” into the search functionality
  • This means that I can handle any number of languages (without expanding columns and destroying the visual impact), so I no longer need to restrict languages outside the top 5 to ‘Other’.
  • Therefore, when we scrape a profile, can we add any new language found in the language section to the Language entity, if it doesn’t exist already?
    • I can add the 2-letter acronym and the flag manually later to maintain that, but capturing the new Language at scrape time would be terrific.
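
The "add it if it doesn't exist already" step is a get-or-create lookup. Below is a minimal in-memory sketch of that idea; the Language and LanguageStore classes are hypothetical stand-ins for the real Language entity, and matching names case-insensitively is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Language:
    name: str
    code: Optional[str] = None   # 2-letter acronym, added manually later
    flag: Optional[str] = None   # flag, added manually later


class LanguageStore:
    """Hypothetical stand-in for the Language table."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Language] = {}

    def get_or_create(self, raw_name: str) -> Optional[Language]:
        name = " ".join(raw_name.split())     # trim and collapse whitespace
        if not name:
            return None                       # nothing scraped: store nothing
        key = name.casefold()                 # assumption: case-insensitive match
        if key not in self._by_key:
            self._by_key[key] = Language(name=name)
        return self._by_key[key]


store = LanguageStore()
first = store.get_or_create("  Welsh ")
again = store.get_or_create("welsh")
print(first is again, first.name)
```

Returning None for an empty name means a profile without a languages section never creates a placeholder row.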

2. Additional connections scrape  (DONE)

  • If you recall, I asked that when processing a full-profile scrape we also capture the ‘Basic’ scrape details of the suggested names that are displayed on the right. See below for an example.
  • This is a critical function for building out our population set as fast as possible
  • On reflection, I think you are correct when you suggested we assign a ‘system-user’ as the owner of these ‘orphan’ connections, as ultimately we will want to perform the full scrape on these profiles, and each will need a user to marry up against in order to trigger that. Do you agree, or could the system-linkedin cookie just look for non-owned (orphaned) users?
    • If we could do it without a system-owner it might make it easier to maintain the database? Open to ideas
Complete
4.06

Profile suggestions

  • Build out a page that contains suggestions for a profile: Connection Strength Score
    • Number and seniority distribution of connections
    • Cross-industry diversity
Complete
4.07

Languages

  • For every connection that doesn't have a languages section, we save a ‘null’ result
    • We shouldn't save the null case
    • https://linkedinexport.atts-systems.com/languages_spoken/index

Birthday

  • The birthday field is not being scraped (my profile has my birthday in it as an example)
    • LinkedIn only gives day-month. Save as dd-mmm-2000 (we will display only as dd-mmm)
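
A possible shape for the dd-mmm-2000 convention, as a sketch only. The exact text LinkedIn returns is not stated here, so the accepted input formats ("15 March", "March 15" and abbreviated months) are assumptions, as are the function names. Because 2000 is a leap year, a 29 February birthday still parses.

```python
from datetime import date, datetime
from typing import Optional

PLACEHOLDER_YEAR = 2000  # LinkedIn gives no year; 2000 is a leap year

# Assumed input layouts; adjust once the real scraped format is known.
_FORMATS = ("%d %B %Y", "%B %d %Y", "%d %b %Y", "%b %d %Y")


def parse_birthday(text: str) -> Optional[date]:
    """Turn a day-month string into a date in the placeholder year."""
    cleaned = " ".join(text.replace(",", " ").split())
    if not cleaned:
        return None
    for fmt in _FORMATS:
        try:
            return datetime.strptime(f"{cleaned} {PLACEHOLDER_YEAR}", fmt).date()
        except ValueError:
            continue
    return None


def stored_form(d: date) -> str:
    return d.strftime("%d-%b-%Y")   # dd-mmm-2000, as saved


def display_form(d: date) -> str:
    return d.strftime("%d-%b")      # dd-mmm, as shown to the user


birthday = parse_birthday("15 March")
print(stored_form(birthday), display_form(birthday))
```

Month names are parsed and formatted using the process locale, so this assumes an English locale.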

Photo

  • Correct the mistake whereby a profile with no photo is allocated someone else's photo
Complete
4.08

Reports

  • Build Industry and Candidate Moves reports
Complete
4.09

Access control for the search functionality

  • Limit the access to the search facilities to those recruiters that have paid for the search functionality
  • For others, hide the name, link and export function (unless the profiles are their own connections)
  • Pricing and licensing for recruiters. 
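
One way the masking rule above could work, as a rough sketch. The Viewer and Profile classes and their field names are hypothetical, and "paid" is reduced to a single boolean because pricing and licensing are still open.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Viewer:
    user_id: int
    has_paid_search: bool  # assumption: one flag stands in for the licence check


@dataclass(frozen=True)
class Profile:
    name: str
    link: str
    headline: str
    # IDs of the users who have this profile as a connection (hypothetical field)
    connected_user_ids: frozenset = field(default_factory=frozenset)


def search_row(viewer: Viewer, profile: Profile) -> dict:
    """Build the search-result row that a given viewer is allowed to see."""
    unrestricted = (
        viewer.has_paid_search or viewer.user_id in profile.connected_user_ids
    )
    return {
        "headline": profile.headline,
        "name": profile.name if unrestricted else None,
        "link": profile.link if unrestricted else None,
        "can_export": unrestricted,
    }
```

Doing the masking where the row is built, rather than in the page template, keeps hidden names and links out of the response altogether.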

Recruiter engagement

  • Detail the recruiter value in online pages that are not available from the menu
Complete
4.10

Scrape

  • When a LinkedIn contact has something unusual in their name (eg the maiden name in brackets, or a same or PhD in brackets) then it fails to allocate the first name and last name successfully.  
    • Suggestion:  Take anything in between brackets and delete (including deleting the brackets)
      • See attachment for examples in my Connections list (6 out of 1,222 ‘fail’)
  • The gender scraped from LinkedIn needs to be trimmed to be usable; it currently returns “…….She/Her……”
    • Let's test it
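
The two clean-ups suggested above can be tried out with a small sketch. The function names and sample names are made up for illustration. Treating the first word as the first name and the rest as the last name is an assumption about how the split should work.

```python
import re
from typing import Optional, Tuple

_BRACKETED = re.compile(r"\([^)]*\)")                   # "(...)" including the brackets
_PRONOUNS = re.compile(r"[A-Za-z]+(?:/[A-Za-z]+)+")     # e.g. She/Her, They/Them


def clean_name(full_name: str) -> str:
    """Delete anything in brackets, then collapse the leftover whitespace."""
    return " ".join(_BRACKETED.sub(" ", full_name).split())


def split_name(full_name: str) -> Tuple[str, str]:
    """Assumed rule: first word is the first name, the rest is the last name."""
    parts = clean_name(full_name).split()
    if not parts:
        return "", ""
    return parts[0], " ".join(parts[1:])


def extract_pronouns(raw: str) -> Optional[str]:
    """Pull 'She/Her' out of padded text such as '.....She/Her.....'."""
    match = _PRONOUNS.search(raw)
    return match.group(0) if match else None


print(split_name("Jane (Smith) Doe"))
print(split_name("John Smith (PhD)"))
print(extract_pronouns(".......She/Her......"))
```

Searching for the pronoun pattern, rather than stripping specific padding characters, avoids having to know exactly what surrounds the value.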

Settings

  • The settings for MaxLinkedInContacts and LinkedInScrapeBatch don't seem to take effect. We should use these fields to reduce the initial scrape to make testing quicker
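
As a sketch of the intended behaviour, the two settings could be applied like this. The function is hypothetical and takes the values as plain arguments, since how the application actually reads MaxLinkedInContacts and LinkedInScrapeBatch is not shown here.

```python
from itertools import islice
from typing import Iterable, List


def plan_batches(contact_ids: Iterable[int],
                 max_contacts: int,
                 batch_size: int) -> List[List[int]]:
    """Cap the scrape at max_contacts, then split it into batch_size chunks.

    max_contacts plays the role of MaxLinkedInContacts and batch_size the
    role of LinkedInScrapeBatch (assumed meanings).
    """
    if max_contacts < 0 or batch_size <= 0:
        raise ValueError("max_contacts must be >= 0 and batch_size > 0")
    limited = list(islice(contact_ids, max_contacts))
    return [limited[i:i + batch_size] for i in range(0, len(limited), batch_size)]


# A test run limited to 50 of 1,222 connections, 20 per batch.
batches = plan_batches(range(1222), max_contacts=50, batch_size=20)
print([len(b) for b in batches])
```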

Popup Message

  • The popup message appears each time you launch Index - “There are 1222 connections. The system pulls data in the batch of 20 per hour. It will take approximately 62 hours or 2 days and 14 hours to completely fetch all the data of 1222 connections. Our system will auto pull data in the interval of 1 hour. You can check status in every one hour. To continue please start the process.”

Complete
4.11

Recruiter firm 

  • Add a flag to determine whether connections are shared across the firm fully or vertically only. Open question: how to motivate non-admin users to participate?
  • If vertically, need to identify the admin users at the firm
Complete
4.12

User - own profile

  • Ensure that the user's own profile is included in the scrape
  • Launch a page that gives them their own report.
Complete
4.13

Full Scrape:  Improvements

Trim the Employers 

  • When scraping, trim the name of the employer(s) to remove any leading/extra spaces (IS THIS DONE?)
    • The search function was struggling to handle cases where there were/were not additional ‘spaces’, so I figured that the best fix was to trim as far upstream as possible
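
A minimal sketch of trimming at scrape time; the function name is illustrative. It strips leading and trailing whitespace and collapses internal runs to a single space, so the search compares like with like.

```python
def normalise_employer(name: str) -> str:
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return " ".join(name.split())


# Hypothetical scraped values that should all map to the same employer.
scraped = ["  Acme Capital", "Acme  Capital ", "Acme\tCapital"]
print({normalise_employer(s) for s in scraped})
```

Applying the same function to the search term as well would keep both sides of the comparison consistent.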
Complete
4.14

LinkSync User view

  • Error due to PhoneAnalyzer
Complete