Binary search trees

Duration: 45 minutes
Ages 11 to 14: Lesson 1

Learning outcomes

Students will be able to:

Assess the efficiency of a binary search tree (BST) for searching.
Computational Thinking: Evaluation Computer Science: Algorithms
Compare the efficiency of a binary search tree (BST) with other approaches to searching.
Computational Thinking: Evaluation Computer Science: Algorithms
Organise and find data using a binary search tree (BST).
Computational Thinking: Algorithmic Thinking Computational Thinking: Decomposition Computational Thinking: Evaluation Computational Thinking: Logic Computer Science: Algorithms Literacy: Reading

Preparatory knowledge

Binary searching and sequential searching lessons.

Note: This activity is about Binary Search Trees, or BSTs. There is a previous activity about Binary Search, which uses a sorted list of values. Students often understandably confuse Binary Search and Binary Search Trees. Both are mentioned below, and although they have elements in common, they are distinct ideas. We’ll often use the abbreviation BST here to help to make it clear that we’re talking about the trees, and not just Binary Search.

Notes on resources

This activity requires a "binary search tree" (BST) to be marked out on as large a surface as possible, such as a playground. The BST consists of lines for the students to follow, and "nodes" (these are typically circles or cards) that have values written on the back of them. While the activity could be done on a tabletop or indoors, it's much better if it's in a hall or playground, marked out on the ground with chalk, painter's tape, or masking tape (because students will be creating their own layouts, chalk on pavement tends to be a lot easier than tape). The nodes should also be as large as possible. For a smaller indoor setup, paper cups, envelopes or sticky notes with numbers underneath are suitable (as for the binary search activity), but it's ideal if they are large, such as rubber squares, paper plates, or discarded CDs. The writing needs to be invisible when they are upside down. The left-hand picture below shows what they will look like. The right-hand picture shows the numbers to put under each disc, but these shouldn't be visible to the students. The lesson explains how to lay these out.

Binary search tree drawn on the ground in chalk. — An example of what we will be creating. Each binary search tree has a different shape depending on the data.

Key questions

When we organised the sequential and binary search, what would happen if we had new data (information) to add? What would we need to do to insert that data? Is one faster to add data to? Is one faster to find the data in?

Potential answers could include:

With a sequential search it's fine to add information to the end of the list since the order doesn't matter, but with binary search it needs to be inserted in the correct place to keep it in sorted order, so you'd need to find the place to insert it and move everything up. So with sequential search it's easy to add new data since the order doesn't matter, but it's slow to find things; whereas binary search is slow to add data but quick to find it. In this lesson we'll look at a method that is fast for adding and finding data.

Lesson starter

What are the names of the parts of a tree that you can see?

Possible answers are: trunk, branches, leaves... Explain that we’ll use some of these words to talk about a binary search tree.

Let’s look at an example of what happens in the physical world: If a cat or ball is stuck in a tree, and you want to describe where it is, starting at the base (root), could you guide the rescuer by saying which way to go at each fork in a branch? For example, below is a picture of a tree with a cat - using only "left" and "right", what are the directions you would give someone to climb up to where the cat is? Can you guide someone to anywhere in the tree just by giving them "left" and "right" directions?

Teaching observations

Particular parts of a tree that are technical terms used in this activity are the branches, leaves, and root; but students might also mention the trunk, bark, fruit and other features. You can probably use these terms without defining them when you look at BSTs; for example, in the exercise above with a real tree, it makes sense to talk about a “left branch” and “right branch”, and students probably won’t need these explained.
For the tree drawn, the instructions "right, right, left, right" will get to the cat; note that although the first branch looks like a decision between "left" and "straight", we want to get used to the idea of calling it "left" and "right". Some students might observe that not all trees have two-way branches, although a two-way fork is common; even something that looks like it is branching three ways can probably be seen as two two-way branches if you look carefully.

Two binary search trees. One with numbers in the nodes and one without.

Mark out the "tree" above on the ground. The numbers are shown for your convenience in the left picture, but they would be hidden (turning the disc upside down) as in the picture on the right when the students first see the tree. (Note that the colours and arrows aren't necessary, but can help to talk about the tree.)

Show the students the "tree" that you have marked out. (If you are confident with the activity, you can mark out a tree with more than 12 discs - the larger the better.)

If this represented a tree, where would the root be? (It's the single disc nearest you). Which discs do you think are called the leaves? (These are the disks at the top with no branches coming out of them).

Teaching observations

The term "root" is the technical term for where you start in a "binary search tree". It's called "binary" because each disc has at most two branches, "search" because we can use it to find things, and "tree" because it has branches. The technical word for the discs is "node". The term "branch" is used to refer to each of the lines where the tree splits (at each node), and a "leaf" is a node at the end of the tree. It's helpful to use these terms with the class so that they get used to them.

Searching the tree

Reveal what is under the disc at the root, which in the example has the number 40.
Explain that we're looking for the number 25 in this tree. There is a simple rule: everything down the left branch of a disc is smaller than the value shown, and everything down the right branch is larger.
Ask which branch the number 25 should be down. (It is smaller than 40, so it must be down the left branch, since everything smaller is down the left).
Turn the first disc back over (so that students don't easily remember what was on it), and follow the left branch, revealing the number under the next disc. In this case it is 23. Ask if 25 will be left or right from 23? (It is larger, so it will be down the right branch.)
Hide the 23 again, and reveal the disc down the right branch, which is 30. Ask if we should go left or right? (Left, because 25 is less than 30; students may also observe that there's no choice, but that isn't relevant, as you must go left if the number is smaller.)
Hide the 30, and reveal the disc down the left, which is 25. You have located the disc you were searching for.

Finding another number

Make sure all the discs are hidden, and return to the root.
This time ask the students for directions to find the number 55. (They should work out that this requires going right at 40, right at 44, right at 48, and left at 91.) Remember to flip each disc back after looking at it, to avoid students memorising the tree.

What if the number isn't there?

Ask students to search for the number 35. (They should flip over 40, 23, 30, but at 30 they will want to go right since 35 is larger than 30. There's no branch, so you can conclude that the number isn't in the tree, despite having looked at only 3 discs.)

Variations

Adding a value

To do this, just search for where it should be, and add it on a new disc in the correct direction from the last disc that you got to (the added disc will be a new "leaf"). For example, when you searched for the number 35 in the previous example, it could have been added on a new right branch from number 30.

Build your own tree

You can build your own tree with any numbers that you want to be able to search. You could either make up some numbers, or use something like the years of historic events. It’s best to avoid having the same value twice, but we explain how to accommodate this below.

To build a tree, the very first value that you add to the tree will be the root, so just place that data on the ground. It's a very simple "tree" at this stage, with just one value in it!! But it gets more interesting as you add more values. Once the root is put down, students can add each value as for "adding a value" above i.e. search for where it belongs, and when you get to the end of the path, add it as a branch from the disc that you got to. For example, suppose we're making a tree with the numbers 45, 23, 26 then 76. The tree would be built up as follows:

Constructing a binary search tree with the numbers 45, 23, 26 and 76.

Initially the tree will seem very simple with just 45 as the root, but it can grow rapidly as values are added. The shape will depend on the order that the numbers arrive in. Note that it pays to have the branches from the root drawn to be fairly wide, as the tree may need to be wider further down as more values are added. The branches needn’t be straight, as long as it’s obvious which is left and which is right.

Teaching observations

Students may ask what happens if two values are the same. This is something they could explore; if they do, here are some ideas that they are likely to come to:

If you allow duplicate values in the tree, you need to be consistent about whether an equal value is down the left or right branch (for example, you could make a rule that the right branch is greater than or equal to the value at the branch)
Or you could count the number of items at a node, so if you get a second one, you store a count of 2 at that node (this is an example of data associated with the value)
In some cases duplicates shouldn't occur; for example, if the value is an ID or account number, there shouldn't be two people with the number.

Build your own tree with other data

Instead of using numbers, we can put words in a tree. Discuss what "less than" would mean with words (students should observe that it could be dictionary order, so "less than" means earlier in the dictionary). Have a set of words written on the back of a card, or old CD (the words could be a spelling list).

Here's an example using animal names; words that are earlier in the alphabet are down the left branch, and vice versa. This is the final version; below are the instructions for building it up with the class.

An example of a full binary search tree with words as nodes.

Take a word randomly from a pile you have prepared, and use this as the first value that you add to the tree; or you could have students write down their favourite animal. The first word chosen becomes the root (in the example it is the word “elephant”).

Take the next word, to start building your binary search tree. As a class, decide if it’s a word that starts with a letter that is lower than “e” or higher than “e”. In this example it’s the word “cow”, so it comes before "elephant" i.e. to the left. Have a student draw a long branch from the root. (It’s important to have this drawn out on a shallow angle (around 20 degrees) to leave space for later. Then place the word down at the end of this branch.

The word 'cow' is added to the left of the root node.

Go back to the root word, then compare that with the next word (“dog” in the example). Is D higher or lower than E? It is lower, so go left, then compare “dog” with “cow”. Everyone should agree that it’s higher, so draw a long branch to the right.

The word 'dog' is added to the right of the 'cow' node.

Repeat this process as a class until students understand how to add a new word correctly.

Once they do, put students into groups of 3, where one person holds the word and makes the decisions, another person draws the branch once a decision is made, and the third person confirms it is correct. (This is important because if the binary search tree is built incorrectly, then it won’t be reliable). Once the the teams are ready, have them line up behind the root word and set teams off in 20 second intervals, until all words have been placed on the tree. Once the binary search tree is set up, then work through different scenarios, including:

Adding a new word - how many words can we add to our binary tree search? (There is no limit.)
Finding different words and counting how many nodes need to be checked. (Some will be near the root, but usually will be further down the tree.)
Can we reliably accept that if a word isn’t found in the binary search tree by following the left/right rule, that it isn’t there? (If it is built accurately, then you can be sure that it's not down another branch of the tree that wasn't explored; if students make a mistake placing a node, this could be a learning point, since that node probably can’t be found in future searches.)

Teaching observations

It's ideal to draw the tree without any branches overlapping, so students need to draw their branches with wide angles, rather than close together.

Use a much larger tree

Here's a large one that you could use! Rather than draw it, you could use the blank version, and allow students to point at a node to ask what its value is, which you read off your secret version with the numbers on. They won’t need to write anything down - at every step they are simply making a left/right decision. Note that it's important that every value down the left branch of a node is smaller than the value in the node, and every value down the right branch is larger.

Teacher's secret version

Note: this large tree can be recreated using these numbers in this order: 53, 86, 61, 27, 88, 30, 49, 23, 28, 55, 91, 12, 21, 72, 90, 32, 10, 99, 81, 60, 45, 20, 93, 97, 3, 31, 41, 58, 76, 84, 73, 18, 92, 7, 47, 36, 74, 5, 48, 98, 19, 15, 46, 75, 14, 16, 17, 96.

Teaching observations

It's easy to make your own binary search tree, but just be sure to start at the root each time you add a new value, which will help to ensure that everything down a left branch is smaller than the node it's branching from. If something is put down the wrong branch then it will throw off the searching! If you're not sure, just use the examples we've given. If you run out of room, the tree doesn't have to be too tidy, as long as it's obvious at each node which way the left and right branches are (especially if there's only one branch, the direction needs to be obvious, although it could bend around after leaving the node).

An untidy, but readable binary search tree.

Use a binary search tree to look up information

In practice, we'd usually have some information associated with the value that we're looking for (such as a person's details). You could build a binary search tree that looks up the population of cities, dates of historical events, or the definition of foreign words (an example is shown below). Having students research the content of a disc and then place it, provides an integrated learning opportunity.

A binary search tree where the nodes are cities with their population number, sorted in alphabetical order.

Teaching observations

When creating the nodes for your own tree, write the key (the value that is being searched for) above any other corresponding information for that node. For example, if you will be searching for cities via alphabetical order, write the city name above the population. If searching by population write the population above the city name.

How fast is a binary search tree?

Here is a large binary search tree. How many nodes (discs) does it have? How many numbers would you compare getting from the root to the far end of the tree (which is called a leaf)? (There are 63 nodes, but the maximum number of discs you'll look at is 6 - that's pretty efficient!) What if there was another layer in the tree? (There would be 127 nodes, but you only need to look at 7 of them). In fact, every time the number of items being searched doubles, it only takes a little more time to search it.

When binary search trees go bad

Have students create a new tree where you give them numbers in ascending order (e.g. 2, 12, 21, 23, 41, 45...), instead of random order. For example, below is a tree where we've added the numbers 5, 10, 15 and so on; as a result, every branch is a right branch. It works, but if we keep getting new numbers to add in increasing order, they will form a long chain of right branches, and it is slow to find things in it. As students add numbers like 50, 55, 60, and so on, they will see that it gets to be a long way from the root to a leaf. Luckily it’s usual for data to arrive in a random order, and you don't get these patterns. The paradox about binary search trees is that they perform the worst when you give them ordered information, and are much better when it's random! In practice this situation can be avoided easily using some simple changes to the algorithm for adding nodes.

A binary search tree with only right branches.

How does this relate to binary search?

The binary search tree (BST) is a lot like a binary search, but because it uses a tree, it's much easier to add an item to it, because you just add it to the end of the tree as a new leaf. In contrast, to use binary search you need to have a sorted list, and if you want to add something to a sorted list you can’t just add it to the end - you have to move all the data in that list along one space, so that there is space in the list for the new piece of data. If a computer program is adding something to a sorted list, this means it has to take everything else in the list and move it to a new place in its memory, and this can take some time! This is why computer scientists choose to use trees in some situations, and sorted lists in others.

Mathematical links

This activity relies heavily on comparing values; making one mistake will send students down the wrong branch of the tree! It can be made more challenging by using larger numbers (e.g. 5 digit numbers such as 32843 and 32480, which look similar, and require some care to compare precisely).

Applying what we have just learnt

A binary search tree is an elegant way to do searching. It has a lot in common with doing a binary search of a sorted list, but the two shouldn't be confused. In particular, a tree makes it very easy to add new values - you just link it to the last node you got to when searching for the value. In the sorted list for a binary search, you have to move everything up to make space for a new value. So in practice, binary search trees (and related structures) are more likely to be used than binary search, since data tends to be dynamic, with new values being added frequently.

Lesson reflection

What surprises did you have with this lesson? Did you see how your knowledge of a binary search supported your learning about binary search trees?

Potential answers could include:

Being surprised that binary trees work best with random numbers.
Like binary search, there is a maximum of two options from each branch and the algorithm for deciding which way to go is the same for every branch.

Seeing the Computational Thinking connections

Throughout the lessons there are links to computational thinking. Below we've noted some general links that apply to this content.

Teaching computational thinking through CSUnplugged activities supports students to learn how to describe a problem, identify what are the important details they need to solve this problem, break it down into small logical steps so that they can then create a process which solves the problem, and then evaluate this process. These skills are transferable to any other curriculum area, but are particularly relevant to developing digital systems and solving problems using the capabilities of computers.

These Computational Thinking concepts are all connected to each other and support each other, but it’s important to note that not all aspects of Computational Thinking happen in every unit or lesson. We’ve highlighted the important connections for you to observe your students in action. For more background information on what our definition of Computational Thinking is see our notes about computational thinking.

Algorithmic thinking

As students find their way around a Binary Search Tree they will be developing very precise algorithms to navigate it. In this activity the main rule they are given is that items to the left are smaller, and items to the right are larger. If they can successfully traverse the tree, they will have operationalised this rule as an algorithm equivalent to “compare the key with the value at the current node, and if it is smaller, move to the node on the left, otherwise move to the node on the right; keep doing this until you find the key or reach the end of the tree.” They may also discover algorithms for finding the smallest value in the whole tree (“keep taking the left branch until the node doesn’t have one”). Many other algorithms are possible for a binary search tree, including finding all the values within a particular range, displaying the values in increasing order, or deleting a value from the tree, although these aren’t covered in this activity.

Examples of what you could look for:

Are students able to confidently find a given value in a tree that they haven’t seen before by only looking at nodes on the path from the root to the node with the value?

Are they able to add a node for a new value in the right place in the tree?

Can they give an algorithm to find the smallest or largest value in the tree without looking at any of the values at all?

Abstraction

Binary search trees are a type of data structure, and so can be used with any type of data, as long as that data can be put into an order. When we use a binary search tree its structure also allows us to ignore much of the information it contains, and focus only on the information that we need at the time. This is because when we use it we only ever have to look at one node at a time, and have no need to look at any others in the tree, until we move on to another.

Examples of what you could look for:

Did students recognise that a binary search tree can also be used to search for words, since we can put them into alphabetical order and compare them?

Did they come up with any other types of data they could put into a binary search tree, such as dates?

Decomposition

Binary search trees are another example of how we can apply the divide and conquer strategy to simplify a problem, and create an efficient solution. Each decision made going through the tree eliminates everything down one branch, reducing the size of the problem.

Examples of what you could look for:

Did students recognise that each decision eliminated an entire section of the tree?

Did they focus on each step, going from one node to the next, individually?

Generalising and patterns

Students who have previously used sorting networks will probably notice many similar patterns between these activities (but it is important to note that these are still very different things!) As with a sorting network, at each node a comparison is made between two numbers, and the result of this comparison tells you where to go next. The interesting thing to observe here is that using just one very simple comparison operation (in different ways), we are able to both sort data into order, and search through it. Another connection between these two activities is the type of data that can be used - the data must have something called a "transitive relation". This is further explained under logic, and also in the Sorting networks topic under logic. Because it is based on a simple comparison step, any type of data that can be sorted using a sorting network can also be placed into a binary search tree. As binary search trees are abstract data structures, they can be generalised for use with any type of data that can be put into some kind of order. The other connection students will likely make is between binary search trees, and binary search! While the names are almost the same, the actual pattern they both follow is subtly different, although they both make use of the divide and conquer strategy. A curious observation about binary search trees is that if values are added in increasing order, they don't work very well! The order that data is added into the tree has an impact on how efficient the tree is, because the tree will form different shapes.

Examples of what you could look for:

Do students recognise that the order we add data into a tree can have an impact on how efficiently the tree performs?

Do students recognise that binary search trees work best if the values are added in a random order i.e. that patterns in the data arriving can make a tree work inefficiently?

Did students make any connections between a binary search tree, and other concepts they have seen before?

Evaluation

As we have seen, because of the way the algorithm for constructing a binary search tree works, trees can sometimes form long chains rather than an evenly distributed tree. This can have a big impact on the efficiency of the tree. As discussed under Decomposition, each time we make a decision at a node we effectively eliminate everything down one branch of the tree. If a tree contains a lot of nodes that only connect to one other node (so it has long chains in it) then each time we make a decision at a node we eliminate none, or very few, nodes. When constructing binary search trees we should examine their structure and evaluate them for efficiency. A well constructed, or balanced, binary search tree will be very efficient to use. The depth of the tree will also be relatively low compared with the number of nodes in it, as the number of nodes that can be at each layer of the tree doubles with each new layer. This doubling is also probably a pattern students have seen before if they have covered the binary number lessons. Evaluating a problem, before creating a tree, is also important because it may not always be the best strategy to use a binary search tree. If we know we already have all of our data, and so won't need to add in more (or almost never will) then using a sorted list and binary search is likely to be more efficient than creating a binary tree. However if we need to add more data in the future then a binary search tree is a much better solution, as it is much easier to add data to a tree than it is to add it to a sorted list.

Examples of what you could look for:

Can students identify what makes an efficient binary search tree and what makes an inefficient one?

Can they compare the pros and cons of using sorted lists and trees in different situations?

Did students observe that the number of nodes that can be at each layer of the tree doubles with each step?

Logic

These trees rely on the logic that if we are following one branch, then one can safely eliminate everything down the other branch. This is because of the simple relationship between the nodes in the tree, and the nodes that they connect to. As with the data we could use in a sorting network, any data can be put into a binary search tree as long as it has what is called a "transitive relation". For example, numbers have a transitive relation based on "less than": the number 5 is less than 10, and 10 is less than 15, which means that 5 must also be less than 15. In general, this transitive relation means: if 'a' is less than 'b', and 'b' is less than 'c', then 'a' is less than 'c'. If items don’t have this relation then there is no logical way for us to figure out where to put it in the tree! Not all relations are transitive; for example, consider the relation "is standing next to". If Arnold is standing next to Tim, and Tim is standing next to Caitlin, it doesn't necessarily mean that Arnold is standing next to Caitlin.

Examples of what you could look for:

Did students recognise that everything down the left branch from a node must be smaller than the value at that node?

Can students reason about where the smallest value in the tree would be? (You keep taking left branches until there are no more to take; the logic is that anything down a right branch must be larger.)

Can they think of other types of data they could put into a binary search tree?

Can they think of any types of data that don’t have a transitive relation, and that therefore can’t be put into a binary search tree?

Binary search trees

Table of contents

Preparatory knowledge

Key questions

Potential answers could include:

Lesson starter

Searching the tree

Finding another number

What if the number isn't there?

Variations

Adding a value

Build your own tree

Build your own tree with other data

Use a much larger tree

Use a binary search tree to look up information

How fast is a binary search tree?

When binary search trees go bad

How does this relate to binary search?

Applying what we have just learnt

Lesson reflection

Potential answers could include:

Seeing the Computational Thinking connections

Examples of what you could look for:

Examples of what you could look for:

Examples of what you could look for:

Examples of what you could look for:

Examples of what you could look for:

Examples of what you could look for: