Descaling Organizations for Scaling Agile – Top takeaways from Agile 2014 Conference

At the Agile 2014 conference we heard many speakers echo the same signal: “Let’s descale the organization instead of scaling agile”. There’s an increasing realization that instituting new models like scrum of scrums to force-fit agile processes to corporate hierarchies isn’t going to work. Models are rigid and linear, whereas human systems are non-linear and need more flexibility. We need to take a hard look at restructuring our organizations to become less siloed and hierarchical.

It’s not the frameworks, it’s the culture

Many scaling frameworks have been proposed. The fact that there are so many makes the magnitude of the challenge evident.

  • Scrum of Scrums (SoS)
  • Large Scale Scrum (LeSS, Larman/Vodde)
  • Scaled Agile Framework (SAFe, Leffingwell)
  • Disciplined Agile Delivery (DAD, Ambler/Lines)
  • Spotify “Model” (Kniberg)
  • Scrum at Scale (Sutherland, scruminc.com), meta framework

The Agile 2014 conference can be viewed as a watershed: the year in which speaker after speaker agreed that it’s not about the framework, it’s about the culture.

Descale your organization

A change in culture requires enterprises to stop paying the politics tax, defined as the time spent on CYA (covering yourself). We need to allow people to choose to unleash their potential and stop fearing failure. Fear results in lies, and lies result in mistrust. We need to build fail-safe relationships.

Olaf Lewitz, in his passionate appeal, proposed a very simple manifesto: “We value people”.

Roundabouts vs. Traffic Signals

In an earlier posting I had mentioned that we need distributed decision-making in more decentralized and flatter structures to deal with rising business VUCA (volatility, uncertainty, complexity and ambiguity). Bjarte Bogsnes echoed the same thoughts. We need Theory Y organizations to deal with business VUCA. We need to trust our people instead of controlling them with rules and policies. Rules and policies require monitoring and enforcement. Like traffic signals, a rule-based system is often inefficient. A roundabout is a self-regulating, value-based system: it trusts users, demands self-discipline, and it’s scalable.

Budgets force decisions to be taken too high up and too early. Budgets often lead to people spending money even when there’s no need just because they have the budget. Some of them do so to avoid losing the budget next year if they don’t spend it this year. Expense allocation should be need based. We should trust people to spend only what is needed and not have them ask for a budget. Read more about Bjarte Bogsnes’ presentation here.

Squads, Tribes, Chapters and Guilds

Spotify offers a fascinating model to grow without losing the benefits of being small. Spotify has kept an agile mindset despite having scaled to over 30 teams across 3 cities. This model has gained a lot of popularity in the agile community.

The nomenclature “squads, tribes, chapters and guilds” suggests that the proposed structure resembles communities more than corporations. We have already seen the advantages offered by communities over corporations in this earlier posting.

You can read more about the Spotify model here.

 

Self Organizing Organizations

Success with FedEx Day got Trade Me, a New Zealand company with 80 people in its engineering team, thinking: what if we allowed people to organize themselves and do what they wanted to do all the time? Further inspired by the Spotify model, they decided to let people choose what they wanted to work on. Product owners defined 11 work streams, and 11 squads were formed by allowing employees to choose the squad they wanted to join. Each squad was limited to between 3 and 7 members and was required to be self-sufficient and co-located.

This unique social experiment inspired and empowered people to self-select. This resulted in stable, focused teams that delivered better. People were happy, as they got an opportunity to do what they wanted with whom they wanted. Read more about this experiment here.

T Shaped Skills

Decentralization and distributed decision-making demand cross-functional skills. These changes also demand new skills, as covered in this earlier posting. Augusto Evangelisti shared his experience building an extremely cross-functional team with each member acquiring skills in all functions. T-shaped skills are characterized by deep skills in one functional area, like QA, and breadth of knowledge in other functional areas like development, product management and operations. For cross-skilling, communication, curiosity, respect and empathy are more important than technical skills. No one works alone in these teams; for example, at least three people work on writing a user story. Shared activities made people more accountable and resulted in better quality: 0 bugs, 3X efficiency, a shared understanding of goals and mutual trust.

Mob Programming

Woody Zuill shared his experience at Hunter Industries, where he has been running the engineering team for the last 3 years using what he calls the mob programming model. The team members self-selected, as they were attracted to work together by the awesomeness of the vision. The whole team always worked on the same task, whether it was a development, testing or deployment task. That way developers got to think like testers and testers got to think like IT operations. Overall interaction improved, which brought in more kindness, consideration and respect. No one was blocked waiting for inputs, as all the concerned members were in the same room. There was only one computer and two projectors. The person at the computer worked as the driver and all others worked as navigators. You can see a video of the team here.

The fact that Hunter Industries continues to support this model suggests that it is working. The best requirements, architecture and design emerge from self-organizing teams. There’s no duplication of code resulting from two developers independently working on similar code in their cubicles. No duplication means less code and lower technical debt.

DevOps is a cultural change

In a session by Pete Cheslock the same thoughts were reiterated in a different context. DevOps often gets mistaken for a practice that can be implemented by appointing a few “DevOps Engineers” and buying a set of tools to do continuous integration and delivery. The bigger point gets missed: DevOps is a cultural and professional movement. You can’t solve social and cultural issues with tools. It’s a journey, and there is no need to hurry to get everyone there today. We need to get everyone’s buy-in, which is a slow process. We have to actively cultivate trust and learning. We have to allow teams to fail and learn. We need to march with the conviction that the direction is right.

Catch-22 of Hiring

There is an inherent conflict at the point of basic information acquisition in the hiring process. The question is: how much information should candidates be required to fill in while uploading their resumes? Too much information increases their work. On the other hand, if minimal information is acquired, hiring managers are left with a whole lot of resumes and very little information. It’s frustrating for hiring managers to read a number of unsuitable resumes before getting one that is suitable.

Catch-22 Situation in Hiring

To elaborate on this situation, let me take the example of my company. We were getting hardly any interesting resumes from our website. We decided to do away with the lengthy process and made it very simple. Now we have an apply button in front of each opening on the careers page; all that a candidate needs to do is upload a latest resume. But simplifying this process resulted in a whole lot of resumes being uploaded. Now our HR executives are spending a significant amount of their time managing resumes. Our hiring funnel in the chart below shows more than 99% of resumes being filtered out, with offers made to less than 1%.

Talent Acquisition Funnel

Should we switch back to our old “elaborate” process? Will the “elaborate” process and form filling ensure that hiring managers get what they want? The reality, as we learnt from our experience, is quite the opposite. Really interesting candidates don’t bother to go through the ordeal of “registering” and uploading their resumes. Those who do are often not that interesting, and some who appear interesting merely “hype up” their resumes to look that way.

I had this problem on my mind when I attended a day-long event focused on applications of machine learning.

Inspiration

Of all the talks I was most inspired by this talk by Nilesh Phadke of BMC Software. He demonstrated applications developed for IT support, far away from the world of hiring. However, I felt that the problem I had on my mind could be solved by applying the machine learning approach.

Information Extraction for Filling up Forms

To automate any workflow, one needs to enter long forms about entities, be it a support ticket or a new candidate for a company. Long forms demotivate users and introduce an element of delay. There is also a tendency to skip non-mandatory fields even though the information is available.

Nilesh demonstrated “Formless Incident Creation”, where the user was allowed to type a complaint in a single text field. As the user typed, a fuzzy matching algorithm matched the words to the correct entities in real time. Not only did it complete the required form, it also searched a myriad of templates of typical incidents and found similar past incidents.
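The word-to-entity matching described above can be sketched with fuzzy string matching. The snippet below is only an illustration: it uses Python’s standard-library difflib as a stand-in for whatever matcher was shown in the talk, and the entity list, the `match_entities` helper and the 0.8 cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical entity vocabulary; a real system would load this from its catalog.
KNOWN_ENTITIES = ["printer", "email", "password", "network"]


def match_entities(text, entities=KNOWN_ENTITIES, cutoff=0.8):
    """Map each typed word to the closest known entity, if one is similar enough."""
    matches = {}
    for word in text.lower().split():
        # get_close_matches returns the best candidates at or above the cutoff.
        close = difflib.get_close_matches(word, entities, n=1, cutoff=cutoff)
        if close:
            matches[word] = close[0]
    return matches


complaint = "my pasword expired and the printr is offline"
print(match_entities(complaint))
```

Misspelled words such as "pasword" still resolve to an entity, which is what lets a single free-text field stand in for several form fields.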

Machine Learning Approach to Catch-22 of Hiring

I was immediately reminded of the catch-22 situation in hiring. Most of the information is already present in the resume that is being submitted. Can we not use an information extraction algorithm to automate this tedious task? Can we also have different templates for Developers, QA Engineers, IT Support Engineers and Project Managers? Can we use the extracted information to synthesize micro-resumes, short summaries of less than 500 characters, to help hiring managers quickly read resumes on their mobile devices?
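To make the micro-resume idea concrete, here is a minimal sketch that assumes the extraction step has already produced structured fields. The `ExtractedResume` fields, the `micro_resume` helper and the output layout are hypothetical; only the 500-character limit comes from the text.

```python
from dataclasses import dataclass
from typing import List

MAX_CHARS = 500  # limit mentioned in the text


@dataclass
class ExtractedResume:
    # Hypothetical fields; a real extractor decides what is actually available.
    name: str
    role: str
    years_experience: float
    skills: List[str]
    summary: str


def micro_resume(r: ExtractedResume, limit: int = MAX_CHARS) -> str:
    """Join the extracted fields and trim the result to `limit` characters."""
    text = (
        f"{r.name} | {r.role} | {r.years_experience:g} yrs | "
        f"Skills: {', '.join(r.skills)} | {r.summary}"
    )
    if len(text) <= limit:
        return text
    # Cut at the last space that fits, then mark the truncation.
    cut = text[: limit - 3].rsplit(" ", 1)[0]
    return cut + "..."


example = ExtractedResume(
    name="A. Candidate",
    role="QA Engineer",
    years_experience=4,
    skills=["Selenium", "Java", "REST APIs"],
    summary="Built regression suites for a payments product. " * 20,
)
print(micro_resume(example))
```

Different templates per role (developer, QA, support, project manager) could be handled by swapping the format string, but that is left out of this sketch.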

This gave us a new direction for resolving the catch-22 situation. Here is the set of tools that we plan to use:

  • Fuzzy matching: Solr
  • Search: Lucene
  • Natural language processing: OpenNLP
  • Analytics for unstructured text: UIMA

Please stay tuned for updates on how we use machine learning for formless resume acquisition and more efficient search by hiring managers.

 

 

Ranking Resumes using Machine Learning

In a recent article we saw how ranking resumes can help us keep the work in progress (WIP) within limits to improve efficiency. We also saw that an interesting way of achieving this is by playing a mobile game. In this article we will see how machine learning can be applied to rank resumes.

Disclaimer

This article covers a “Quick and Dirty” way to get started. This is in no way the ultimate machine learning solution to the resume ranking problem. What I did here took me less than a day of programming. It could serve as an example for students of machine learning.

Problem Formulation

We train the machine learning program using a “training set” of resumes that have been pre-screened by a human expert. The resume ranking problem can be seen as a simple classification problem: we classify resumes as suitable (y=1) or unsuitable (y=0). Logistic regression is a standard algorithm for such binary classification problems, and it is the one we use here.

Figure: Sigmoid function showing the probability of a resume being suitable

The predictor function yields a value between 0 and 1, as shown in the diagram above. The predictor or hypothesis function hθ(X) is expressed as:

$h_\theta(X) = \frac{1}{1 + e^{-z}}$, where $z = \theta^T X$

where X is a vector of various features like experience(x1), education(x2), skills(x3), expected compensation(x4) etc. which decide if a resume is suitable or not suitable. The first feature x0 is always equal to 1.

Features & Parameters

hθ(X) can also be interpreted as the probability of the resume being suitable for given X and θ. So the resume ranking problem is essentially solved by evaluating the function hθ(X), with the resume yielding the highest value of hθ(X) getting the top rank.
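Ranking by hθ(X) can be sketched in a few lines. The θ values and the three resumes below are made-up numbers for illustration only; each feature vector is (x0, x1, x2) with x0 = 1, x1 standing for experience and x2 for expected compensation.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x), read as the probability of 'suitable'."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


# Assumed parameters; in practice theta is learned from a training set.
theta = [-4.0, 1.2, -0.8]

# Hypothetical resumes as feature vectors (x0, x1, x2).
resumes = {
    "A": [1.0, 6.0, 2.0],
    "B": [1.0, 4.0, 3.0],
    "C": [1.0, 5.0, 2.0],
}

# Highest probability of being suitable gets the top rank.
ranked = sorted(resumes, key=lambda name: hypothesis(theta, resumes[name]), reverse=True)
print(ranked)
```

Because the sigmoid is monotonic, sorting by z = θᵀx would give the same order; the probability is kept here because it is easier to interpret.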

With this prior knowledge of machine learning and logistic regression, we have to find θ by studying a training set of resumes, some of which were judged suitable and the rest unsuitable.

Simplification of the problem

To further simplify the problem, let us not bother about all the attributes like experience, education, skills, expected compensation, notice period etc. while ranking the resumes. As we saw in this earlier post, we need to worry only about the top constraints. We selected as top constraints those which address “must have” features that are “hard to find”. Another benefit of limiting ourselves to these top constraints is that they can be quickly and easily evaluated by the recruiters in short telephone conversations with the candidates. This makes the process more efficient, as it precedes and serves as a filter before the preliminary interview by the technical panel.

Decision Boundary

The training set is a set of resumes that are already known to be suitable or not suitable based on past decisions taken by the recruiters or hiring managers. Let us plot the training set for a particular opening based on past records. For the purpose of this article let us say that resumes are ranked only on the basis of 2 top constraints, viz. relevant experience (x1) expressed in number of years and expected gross compensation per month (x2). The plot would look somewhat like what we see below.

Decision Boundary

If you draw a 45° line cutting the x1 axis at x1=3, it divides the training set so that every point below the line represents a suitable resume and every point above it represents an unsuitable one. In machine learning terms this line is called the decision boundary. All the points on this line represent resumes whose probability of being suitable is 0.5. This is also the point where z=0, as we have seen in the diagram above showing the sigmoid function:

$h_\theta(X) = \frac{1}{1 + e^{-(-3 + x_1 - x_2)}} = 0.5$

This equation represents the point on the sigmoid function where $z = 0$. Replacing $z$ with $\theta^T X$:

$\theta^T X = 0$

$-3 + x_1 - x_2 = 0$, which represents the decision boundary
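As a quick check with made-up numbers (not taken from the plotted training set), two candidates on either side of the 45° line through x1 = 3 evaluate as follows, using z = −3 + x1 − x2:

```latex
\begin{align*}
(x_1, x_2) = (5, 1):\quad z &= -3 + 5 - 1 = 1,  & h_\theta(X) &= \frac{1}{1 + e^{-1}} \approx 0.73 \\
(x_1, x_2) = (4, 3):\quad z &= -3 + 4 - 3 = -2, & h_\theta(X) &= \frac{1}{1 + e^{2}} \approx 0.12
\end{align*}
```

The first point lies below the line (z > 0) and is classified as suitable; the second lies above it (z < 0) and is classified as unsuitable.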

Gradient Descent 

Though we have drawn the decision boundary visually, it may not be the best fit for the training set data. To get the best fit we can use gradient descent to minimize the error represented by the following equation:

$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$

where m is the number of instances in the training set and x(i) is the vector (x0, x1, x2) for the ith instance in the training set of resumes. y(i) takes the value 1 if the ith instance was suitable and 0 otherwise. We are looking for the value of θ that minimizes the error function J(θ). Here θ is the vector (θ0, θ1, θ2).

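We can minimize J(θ) by iteratively replacing θ with new values as follows. Each iteration takes a step of length α down the slope, until we reach the minimum where the slope is zero. The update is applied simultaneously for j = 0, 1, 2 (the constant factor 1/m from J(θ) is absorbed into α):

$\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$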
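The following is a minimal pure-Python sketch of batch gradient descent for this cost function, not the Octave code referred to in this article. The ten training points, the learning rate and the iteration count are assumptions chosen for illustration; the gradient is averaged over the m examples, which only rescales α.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x):
    """h_theta(x) for one feature vector x = (x0, x1, x2)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def cost(theta, X, y):
    """Cross-entropy cost J(theta); h is clamped to avoid log(0)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        h = min(max(predict(theta, xi), eps), 1.0 - eps)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / len(X)


def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Batch gradient descent with simultaneous updates of every theta_j."""
    m = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        errors = [predict(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m
                for j in range(len(theta))]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Made-up training set: (x0=1, years of experience, expected compensation).
X = [[1, 6, 2], [1, 7, 3], [1, 5, 1], [1, 8, 4], [1, 6, 1],   # suitable
     [1, 2, 2], [1, 3, 3], [1, 4, 4], [1, 1, 1], [1, 3, 2]]   # unsuitable
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

initial_cost = cost([0.0, 0.0, 0.0], X, y)
theta = gradient_descent(X, y)
final_cost = cost(theta, X, y)
print(theta, initial_cost, final_cost)
```

With θ initialised to zero every prediction is 0.5, so the starting cost is log 2; the cost after training should be lower. Trying several values of α, as suggested in the roadmap below, amounts to calling `gradient_descent` with different `alpha` arguments and comparing the resulting costs.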

Implementation

We wrote the code to execute this in Octave, as it provides a mature, widely used implementation of vector algebra that suits machine learning algorithms. There are libraries available in Python and Java to build a more robust “production grade” implementation.

Limitations and roadmap for further work

The logistic regression algorithm is useful only if you have a reasonably large training set, at least 25 to 30 resumes. The selection criteria also need to stay the same for the algorithm to work, hence you can’t reuse training sets across different job positions. There are some “niche” positions where it’s impossible to find enough resumes; it’s both difficult and unnecessary to implement machine learning in such cases.

There are many “to-dos” before this program can be made useful. We need to use more features, particularly those of the “must have” type. We also need to run more iterations of gradient descent with different values of α. Lastly, we need more resumes in the learning set to be able to break it down further into a training set, a validation set and a test set.

Conclusion

It’s particularly challenging to rank 20 or more resumes, even when the ranking is based on only 2 or 3 attributes. Recruiters often skip this step as it tends to be tedious, and end up wasting a lot of hiring managers’ time. It’s an error-prone process if a junior recruiter is assigned the task. By automating resume ranking, we hope to avoid human error. We also hope to get early feedback and an improved understanding of important attributes or top constraints by limiting the shortlist to the top 3 resumes. Lastly, it takes a few seconds for this crude machine learning program to rank 20 resumes, something that would take an experienced recruiter 10 minutes.

 

Social Physics Applied to Hiring

In this blog I will explore how the fundamentals of social physics, as described by Alex Pentland, can be applied to hiring. We will explore how these fundamentals help you not only to target the right candidates but also to reward and motivate recruiters.

Evaluating Candidates’ Social Media Presence

Candidates who have more connections and are members of more groups are more likely to explore and acquire new ideas. Idea flow happens in diverse networks. Idea flow happens more by overhearing surrounding conversations than by 1:1 communication. Even someone who is not actively posting messages on a social network is learning a lot by eavesdropping on the conversations that are happening.

Harvesting Groups for Potential Candidates

We need to exercise judgment while selecting groups to harvest for potential candidates. Social groups that have more frequent short conversations are better places to look for innovative ideas flowing than groups that are having long but less frequent communication. Also groups where people with diverse backgrounds participate in these short interactions are better than groups where a handful of people lead and drive most conversations – some of them tend to become monologues or an echo chamber.

Like organizations, each group has a culture. At one end, some are marketing facades for promoting the commercial interests of the group administrators; at the other end, there are lively groups where valuable ideas flow in engaging conversations about topics of common interest. As talent scouts we need to zero in on groups that have the right culture and harvest prospective candidates from them.

Reaching Out to Potential Candidates

Once you have identified the group worth harvesting; you need to start reaching out to individual candidates within the group. You might use your own network to find common contacts to get introduced. You must exercise judgment while selecting the contact through whom you are approaching the candidate. Not all connections are “trusted”. Each person has only a few “trusted” connections. Look at recommendations and other conversations for determining such a “trusted” connection.

Social Incentive for Recruiters

One of the important findings of social experiments conducted by Alex Pentland was that change of behavior can be brought about by frequent recommendations from “trusted” connections in a short period.

Depending on the level of interaction between the influencing buddy and the influenced target, a social network incentive scheme works almost four to eight times more efficiently than the traditional individual incentive approach. If we can provide social incentives specially designed for a group for a limited time, they can provide the positive reinforcement needed for individuals to accept new ideas and change their behavior.

You can use tools like InMaps for LinkedIn to visualize your connections as groups of sub-networks of connections who are well connected with other connections. You can devise a social incentive especially for a sub-network and run a campaign for a limited time. This will create the required buzz and positive reinforcement at a high frequency.

Ranking Game for Recruiters

Social incentives combined with monetary incentives work better than purely monetary incentives to reward recruiters for sourcing the right candidate. In an experiment (the Red Balloon Experiment), teams that were encouraged to share monetary incentives showed better teamwork and quicker hiring than teams where individuals were rewarded only for their own effort. You get not only a monetary reward but also a social reward when you share your monetary incentive with your friends. That way you can get more friends involved in your mission.

We saw in my last blog posting how limiting the shortlist to the top 3 candidates helps to make the hiring process more effective and efficient. We devised a mobile game and invited recruiters in our company to play it. This game shows a short summary of 2 candidates on the mobile screen and asks the player to vote for one. The game goes on till you have voted on all possible pairs. In the back end a sorting program bubbles up the top 3 candidates based on the players’ votes. The recruiter who sourced the successful candidate shares his/her reward with others who voted for the candidate. And all the players share their scores indicating “successful votes” with their social networks (the Peer See approach).
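The back end of such a game can be sketched as a simple vote tally over all candidate pairs. This is an illustrative stand-in, not the sorting program we actually used; the candidate names, the per-player preference scores and the helper names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def play_round(candidates, choose):
    """Show every possible pair to one player and tally that player's votes."""
    tally = Counter({c: 0 for c in candidates})
    for a, b in combinations(candidates, 2):
        tally[choose(a, b)] += 1
    return tally


def shortlist(tallies, k=3):
    """Add up all players' tallies and return the k most-voted candidates."""
    total = Counter()
    for t in tallies:
        total.update(t)
    return [name for name, _ in total.most_common(k)]


candidates = ["Asha", "Ben", "Chen", "Dev"]

# Each player is simulated by a private preference score (made-up numbers).
player_scores = [
    {"Asha": 4, "Ben": 2, "Chen": 3, "Dev": 1},
    {"Asha": 3, "Ben": 4, "Chen": 2, "Dev": 1},
]


def make_chooser(scores):
    # The simulated player votes for whichever candidate they score higher.
    return lambda a, b: a if scores[a] >= scores[b] else b


tallies = [play_round(candidates, make_chooser(s)) for s in player_scores]
top3 = shortlist(tallies)
print(top3)
```

With n candidates each player sees n(n−1)/2 pairs, so the game stays short only when the list has already been trimmed to a manageable size.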

Advantages

  • Recruiters act as collaborators instead of adversaries
  • Those who consistently vote for successful candidates receive respect and recognition further motivating them to improve their selection skills.
  • Introduces playfulness that is an inherent component of creative teams.
  • Newbies learn from their mistakes by seeing how everyone else is voting.

 

Lean Hiring- An Experience Report

In an earlier post we saw how organizations can create an ecosystem to attract top notch talent. But that is a long term vision which will take time to implement. How do we address more immediate hiring needs? In this post I will try to apply the learning from Lean and Agile principles and practices like Kanban to the process of hiring.

Image

Let us first compare hiring with software development to find the similarities between the two.

Similar Problems- Comparison Between Software Development and Hiring

  • We are not sure about the outcome when we start: Software evolves as new information comes in. Similarly, job requirements change to accommodate new needs, or the bar is lowered for reasons of non-availability and urgency. Sometimes a professional is internally transferred from another project and the need to hire simply goes away.
  • Large batch sizes (a.k.a. the waterfall model of development): If you think of software development and hiring as workflows, managers often try to maximize utilization of resources at each step of the workflow by handing over work in large batch sizes. It’s not unusual for an HR manager to source more resumes to improve the chances of finding the right candidate. This results in a pile-up of half-done WIP before the bottleneck, which is wasteful. The Theory of Constraints and Kanban address this problem by putting a WIP limit on the size of the batch to be handed over at each step of the workflow. Using Kanban to manage hiring is not new.
  • Delayed feedback: Large batch sizes also result in longer iterations and delayed feedback, so recruiters and developers waste cycles working on what they think is needed, which often turns out to be different. We load the hiring managers by arranging many interviews without asking for feedback about the interviews that have already happened. We need to ask, learn from the feedback and use that learning to improve the quality of candidates in subsequent cycles.
  • Ambiguous requirements: It’s not unusual to start developing a software product with some high-level idea and a few whiteboard sketches. Similarly, we often hear managers giving high-level directives to hire “smart developers” or “kick-ass salesmen”.
  • Dynamic marketplace: Neither software products nor hiring opportunities are permanent. They go away with changes due to technology, competition, new ideas and realization.
  • Waste resulting from unused code or resumes sourced: We often write more code than required. We often build more features thinking we are adding value. Similarly we often source too many resumes and interview too many candidates to improve our chances of finding the best match. Unused code and resumes represent the waste we should be attempting to minimize.
  • Vague acceptance criteria and definition of done: Software development and hiring can go on in perpetual loops because the end states are not well defined. Both software development and hiring reach states where doing more work would cost more than the value you get out of it. That’s when you should stop. There is no definition of 100% completion. But the good part is you can start using the software even if all its features are not yet implemented. Similarly you can start using a team that is not yet completely staffed. Best value is derived by prioritizing must have features in software or must have skills while hiring.

Here are some useful tips to make the hiring process quicker and more efficient. In the spirit of continuous experimentation, we tried adopting proven and well-tested lean and agile practices to streamline hiring in my company.

Agile and Lean Principles Applied to Hiring

In an earlier post we had seen how doing painful things more frequently reduces the pain. We tried to make the hiring process less painful by doing the following –

  • Short iterations with quick feedback: Long and “hyped up” resumes were consuming a lot of our recruiters’ and hiring managers’ time. We overcame this problem by using a 500-character micro-resume covering important facts including relevant skills, project experience, notice period and expected compensation. This semi-automatically generated micro-resume was made actionable with “detailed resume”, “accept”, “reject” and “call” links. Hiring managers were encouraged to view these on their mobile phones to provide quick feedback. The printed version of this micro-resume also helped us populate the Kanban board.
  • Using the learning from the feedback: The recruiters asked the hiring managers to give a good idea of “must have” and “good to have” skills. Based on this information the recruiters shortlisted the top 3 candidates, whose micro-resumes were shared with the hiring managers via email and text message. We waited for the hiring manager’s response before sharing any more micro-resumes. One such iteration ideally got over within a day. At the end of the day we either had a shortlist of selected candidates or valuable learning that improved the next iteration. Due to the “anytime anywhere” nature of mobile phones, iterations were quicker where the hiring manager was more mobile-savvy.
  • Small batch size: Instead of inundating the hiring manager with a number of resumes, we put a limit of 3 as stated above. This forced the recruiters to do a lot of groundwork and research to select and rank the top 3 resumes from the couple of dozen that would satisfy the selection criteria. As seen in an earlier post, instead of relying only on the information in the applicant’s resume, we leveraged additional information available on social media platforms such as LinkedIn, StackOverflow and GitHub to determine the ranks. The smaller batch motivated the hiring managers to be more responsive, as it reduced the number of pending cases needing their attention. They were also more willing to provide quick feedback to enable the recruiter to learn from it and provide better choices in the subsequent cycles. In fact, in some cases the feedback came immediately, as the hiring manager disagreed with the ranking given by the recruiter.
  • Prioritization of requirements: While understanding the requirements we decided to check the most difficult constraints first. In the diagram below, you can see the order in which we evaluated the constraints. This made the job of selecting the top 3 resumes out of all the resumes relatively easy. This job requirements matrix was filled in in the presence of the hiring manager. The limited space provided forced the hiring manager to think really hard before writing down the requirements in the appropriate space.

Image

  • Planning/prioritizing interviews: We always had the most suitable candidate in the backlog as the next to be interviewed. Often it’s hard to get suitable time slots from good candidates, and recruiters end up scheduling a less suitable candidate ahead of a more suitable one. We made it clear to the recruiters that it’s not necessary to use all the time made available by the interviewers. Interviewers’ time is a scarce resource which needs to be utilized judiciously. Moreover, if the interviewer rejects the candidate, learning from the rejection of a stronger candidate is more valuable than that from the rejection of a weaker one. We always played our best card.
  • Timeboxing: Whatever happens one has to conclude the process at some point. Many times you don’t get exactly what you want but you must staff the position for business to carry on. Prioritizing and having a backup candidate in case the best candidate doesn’t show up are some of the precautionary measures one has to resort to under time pressure. Like a truly agile process we kept some of the unmet requirements for the next round of hiring and had a retrospective to formalize the learning from the previous round.
  • WIP limit: Having too many candidates interviewed results in a longer hiring cycle. It also results in inefficient use of interviewers’ time. The number in brackets under each step on the Kanban board (as shown in the diagram at the top) is the WIP limit. For example, we can’t have more than 3 candidates waiting for a preliminary interview. We “pulled” a candidate from the shortlist only after one of the three interviews had happened and we had got the feedback. This enabled us to learn from the feedback and to apply that learning to decide the next candidate.
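The pull rule behind the WIP limit can be sketched as a small state machine. This is only an illustration under assumed names: `HiringBoard`, its methods and the candidate labels are hypothetical, and the limit of 3 is the one used for the preliminary interview step above.

```python
from collections import deque


class HiringBoard:
    """Two Kanban columns: a ranked shortlist and 'awaiting preliminary interview'."""

    def __init__(self, wip_limit=3):
        self.wip_limit = wip_limit
        self.shortlist = deque()        # ranked, best candidate first
        self.awaiting_interview = []    # never longer than wip_limit

    def add_to_shortlist(self, candidate):
        self.shortlist.append(candidate)

    def pull(self):
        """Move the best shortlisted candidate forward if the WIP limit allows."""
        if len(self.awaiting_interview) >= self.wip_limit or not self.shortlist:
            return None
        candidate = self.shortlist.popleft()
        self.awaiting_interview.append(candidate)
        return candidate

    def record_feedback(self, candidate):
        """An interview happened and feedback arrived: free one WIP slot."""
        self.awaiting_interview.remove(candidate)


board = HiringBoard(wip_limit=3)
for name in ["c1", "c2", "c3", "c4", "c5"]:
    board.add_to_shortlist(name)

pulled = [board.pull() for _ in range(4)]   # the fourth pull is blocked
board.record_feedback("c1")                 # feedback frees a slot
next_up = board.pull()
print(pulled, next_up)
```

Because a new candidate is pulled only after feedback arrives, the recruiter has a chance to re-rank the remaining shortlist before the next pull.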