More Operations on Sorted Sets
Authors: Darren Yao, Benjamin Qi, Andrew Wang
Contributors: Aadit Ambadkar, Jeffrey Hu, Jason Sun
Finding the next element smaller or larger than a specified key in a set, and using iterators with sets.
C++
Resources | ||||
---|---|---|---|---|
IUSACO | module is based off this | |||
CP2 | see decription of BSTs and heaps |
Java
Resources | ||||
---|---|---|---|---|
IUSACO | module is based off this | |||
CP2 | see decription of BSTs and heaps |
In sets and maps where keys (or elements) are stored in sorted order, accessing
or removing the next key higher or lower than some input key k
is supported.
Keep in mind that insertion and deletion will take time for sorted sets, which is more than the average insertion and deletion for unordered sets, but less than the worst case insertion and deletion for unordered sets.
Using Iterators
In Bronze, we avoided discussion of any set operations involving iterators.
C++
Resources | ||||
---|---|---|---|---|
CPH |
Java
In Java, iterators are helpful for looping through sets.
Iterators used with HashSet
would yield the elements in random order:
Set<Integer> set = new HashSet<Integer>();set.add(1);set.add(3);set.add(0);set.add(-2);Iterator it = set.iterator();while (it.hasNext()) {Integer i = (Integer)it.next();System.out.print(i + " "); // returns some random order}
But with TreeSet
the elements are in sorted order:
Set<Integer> set = new TreeSet<Integer>();set.add(1);set.add(3);set.add(0);set.add(-2);Iterator it = set.iterator();while (it.hasNext()) {Integer i = (Integer)it.next();System.out.print(i + " "); // returns -2 0 1 3}
Instead of creating an iterator and looping with it like in C++, Java provides a for-each loop which creates a hidden iterator and loops with it automatically:
Set<Integer> set = new TreeSet<Integer>();set.add(1);set.add(3);set.add(0);set.add(-2);for (int i : set) {System.out.print(i + " "); // returns -2 0 1 3}
Warning!
You shouldn't modify sets when traversing it with set iterators like in any
other iterators for Collections
(this INCLUDES when using a for-each
loop). The only modification possible is using the iterator remove()
method
which can only be used once before calling the next()
method.
Python
In Python, we can use iter()
to obtain the iterator object of any iterable.
Using next()
on the iterator lets you iterate through the iterable.
nums = {1, "three", 0, (1, 3, 5), 4}iterator = iter(nums)print(next(iterator)) # 0print(next(iterator)) # 1print(next(iterator)) # 4print(next(iterator)) # threeprint(next(iterator)) # (1, 3, 5)# Note:# The numbers appear ordered when iterating due to the hashing implementation# Sets are not ordered, so do not expect this behavior
Python's iterators are fundamental to iteration and are used in for loops and tuple unpacking, for example. This is useful when you want lower level control over your iteration.
In competitive programming however, you will find that the itertools
library
provides more common uses that can save you lots of time. The most useful of
these are shown below.
import itertools# Iterate through and accumulate with a specified functionnums = [1, 0, 5, 2, -4, 3]psa = list(itertools.accumulate(nums)) # operator.add is the defaultprint(psa) # [1, 1, 6, 8, 4, 7]print(list(itertools.accumulate(nums, min))) # [1, 0, 0, 0, -4, -4]print(list(itertools.accumulate(nums, max))) # [1, 1, 5, 5, 5, 5]# Iterate through the perms and combs below
Sorted Sets
C++
The sorted std::set
also supports:
lower_bound
: returns an iterator to the least element greater than or equal to some elementk
.upper_bound
: returns an iterator to the least element strictly greater than some elementk
.
set<int> s;s.insert(1); // [1]s.insert(14); // [1, 14]s.insert(9); // [1, 9, 14]s.insert(2); // [1, 2, 9, 14]cout << *s.upper_bound(7) << '\n'; // 9cout << *s.upper_bound(9) << '\n'; // 14cout << *s.lower_bound(5) << '\n'; // 9cout << *s.lower_bound(9) << '\n'; // 9cout << *s.begin() << '\n'; // 1auto it = s.end();cout << *(--it) << '\n'; // 14s.erase(s.upper_bound(6)); // [1, 2, 14]
Warning!
Suppose that we replace s.upper_bound(7)
with
upper_bound(begin(s),end(s),7)
, which was the syntax that we used for vectors
in the prerequisite module. This will still output the expected results, but its
time complexity is linear in the size of the set s
rather than logarithmic, so
make sure to avoid it!
Java
TreeSet
s in Java allow for a multitude of additional operations:
first()
: returns the lowest element in the setlast()
: returns the greatest element in the setlower(E v)
: returns the greatest element strictly less thanv
floor(E v)
: returns the greatest element less than or equal tov
higher(E v)
: returns the least element strictly greater thanv
ceiling(E v)
: returns the least element greater than or equal tov
TreeSet<Integer> set = new TreeSet<Integer>();set.add(1); // [1]set.add(14); // [1, 14]set.add(9); // [1, 9, 14]set.add(2); // [1, 2, 9, 14]System.out.println(set.higher(7)); // 9System.out.println(set.higher(9)); // 14System.out.println(set.lower(5)); // 2System.out.println(set.first()); // 1System.out.println(set.last()); // 14set.remove(set.higher(6)); // [1, 2, 14]System.out.println(set.higher(23)); // ERROR, no such element exists
Python
Python does not have a sorted set, so see the C++ or Java if you want an implementation from the standard library.
Creating a sorted set from scratch is time consuming, and not reccomended to do so in contests, but if you are curious regarding the implementation, you can find an AVL tree implementation of a sorted set below.
Resources | ||||
---|---|---|---|---|
DSA Python | definition and implementation of AVL Trees in Python |
Warning!
You are not expected to know how to create an AVL tree in USACO.
Since some online judges such as LeetCode include additional libraries, there is
also an implementation of sorted sets included from the sortedcontainers
library. All operations shown below are time, except for
its initialization.
from sortedcontainers import SortedSetsorted_set = SortedSet([5, 1, 3, 2])print(sorted_set) # SortedSet([1, 2, 3, 4, 7])# Add elementssorted_set.add(4)sorted_set.add(6)print(sorted_set) # SortedSet([1, 2, 3, 4, 5, 6])
Warning!
The sortedcontainers
library is not part of Python's standard library. This
means that most online judges do not provide it (including USACO).
One limitation of sorted sets is that we can't efficiently access the largest element in the set, or find the number of elements in the set greater than some arbitrary . In C++, these operations can be handled using a data structure called an order statistic tree.
Sorted Maps
C++
The ordered map
also allows:
lower_bound
: returns the iterator pointing to the lowest entry not less than the specified keyupper_bound
: returns the iterator pointing to the lowest entry strictly greater than the specified key respectively.
map<int, int> m;m[3] = 5; // [(3, 5)]m[11] = 4; // [(3, 5); (11, 4)]m[10] = 491; // [(3, 5); (10, 491); (11, 4)]cout << m.lower_bound(10)->first << " " << m.lower_bound(10)->second << '\n'; // 10 491cout << m.upper_bound(10)->first << " " << m.upper_bound(10)->second << '\n'; // 11 4m.erase(11); // [(3, 5); (10, 491)]if (m.upper_bound(10) == m.end()) {cout << "end" << endl; // Prints end}
Java
The ordered map additionally supports firstKey
/ firstEntry
and lastKey
/
lastEntry
, returning the lowest key/entry and the highest key/entry, as well
as higherKey
/ higherEntry
and lowerKey
/ lowerEntry
, returning the
lowest key/entry strictly higher than the specified key, or the highest
key/entry strictly lower than the specified key.
TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();map.put(3, 5); // [(3, 5)]map.put(11, 4); // [(3, 5); (11, 4)]map.put(10, 491); // [(3, 5); (10, 491); (11, 4)]System.out.println(map.firstKey()); // 3System.out.println(map.firstEntry()); // (3, 5)System.out.println(map.lastEntry()); // (11, 4)System.out.println(map.higherEntry(4)); // (10, 491)map.remove(11); // [(3, 5); (10, 491)]System.out.println(map.lowerKey(4)); // 3System.out.println(map.lowerKey(3)); // ERROR
Python
Python does not have built in sorted maps or sorted sets in it's standard library. However, sorted maps in Python can be created by adding a dictionary to a sorted set, where each element in the sorted set is a key in the dictionary and values can be assigned with the dictionary.
This is the most straightforward implementation, and the ground up implementation of a sorted set can be found in the above section.
Since some online judges include additional libraries, an implementation of
SortedDict
from the sortedcontainers
library can be found below. All
operations shown below are time, except for its
initialization and getting all items, keys, or values
which is time.
from sortedcontainers import SortedDictsorted_map = SortedDict({1: "one", 4: "four", 3: "three"})print(sorted_map) # SortedDict({1: 'one', 3: 'three', 4: 'four'})# Add elementssorted_map[2] = "two"sorted_map[5] = "five"# Output SortedDict({1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'})
Multisets
A multiset is a sorted set that allows multiple copies of the same element.
C++
In addition to all of the regular set operations,
- the
count()
method returns the number of times an element is present in the multiset. However, this method takes time linear in the number of matches so you shouldn't use it in a contest. - To remove a value once, use
ms.erase(ms.find(val))
. - To remove all occurrences of a value, use
ms.erase(val)
.
Warning!
Using ms.erase(val)
erases all instances of val
from the multiset. To remove one instance of val
, use ms.erase(ms.find(val))
!
multiset<int> ms;ms.insert(1); // [1]ms.insert(14); // [1, 14]ms.insert(9); // [1, 9, 14]ms.insert(2); // [1, 2, 9, 14]ms.insert(9); // [1, 2, 9, 9, 14]ms.insert(9); // [1, 2, 9, 9, 9, 14]cout << ms.count(4) << '\n'; // 0cout << ms.count(9) << '\n'; // 3cout << ms.count(14) << '\n'; // 1ms.erase(ms.find(9));cout << ms.count(9) << '\n'; // 2ms.erase(9);cout << ms.count(9) << '\n'; // 0
Java
While there is no multiset in Java, we can implement one using the TreeMap
from values to their respective frequencies. We declare the TreeMap
implementation globally so that we can write functions for adding and removing
elements from it. The first
, last
, higher
, and lower
operations still
function as intended; just use firstKey
, lastKey
, higherKey
, and
lowerKey
respectively.
static TreeMap<Integer, Integer> multiset = new TreeMap<Integer, Integer>();public static void main(String[] args) { ... }static void add(int x) {if (multiset.containsKey(x)) {multiset.put(x, multiset.get(x) + 1);} else {multiset.put(x, 1);}
Priority Queues
Warning!
Priority queues are not implemented in the same way as sets and multisets, but they are included in this section because the operations that they perform can also be performed with sets.
Resources | ||||
---|---|---|---|---|
CSA |
A priority queue (or heap) supports the following operations: insertion of elements, deletion of the element considered highest priority, and retrieval of the highest priority element, all in time according to the number of elements in the priority queue. Priority queues are simpler and faster than sets, so you should use them instead whenever possible.
C++
C++
priority_queue<int> pq;pq.push(7); // [7]pq.push(2); // [2, 7]pq.push(1); // [1, 2, 7]pq.push(5); // [1, 2, 5, 7]cout << pq.top() << endl; // 7pq.pop(); // [1, 2, 5]pq.pop(); // [1, 2]pq.push(6); // [1, 2, 6]
Java
Java
In Java (unlike in C++), we delete and retrieve the lowest element.
PriorityQueue<Integer> pq = new PriorityQueue<Integer>();pq.add(7); // [7]pq.add(2); // [7, 2]pq.add(1); // [7, 2, 1]pq.add(5); // [7, 5, 2, 1]System.out.println(pq.peek()); // 1pq.poll(); // [7, 5, 2]pq.poll(); // [7, 5]pq.add(6); // [7, 6, 5]
Python
Python
In Python (unlike in C++), we delete and retrieve the lowest element.
Note that Python's priority queue is not encapsulated; heapq
operates on a
provided list directly by turning it into a heap, then doing operations on the
heap.
Warning!
Because of a heap's structure, printing out pq
will not print out the
elements in sorted order in Python; instead, it will print out the list. The
comments below are a representation of what the heap contains, not what
the contents of pq
actually are.
import heapqpq = []"""The following line is not necessary since pq starts as an empty list. However,for lists of length greater than 1, heapify is required to turn the list into aheap."""heapq.heapify(pq)
Introductory Problems
Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|
CSES | Easy | Show TagsSorted Set | |||
CF | Easy | Show TagsSorted Set | |||
CSES | Normal | Show TagsSorted Set | |||
CSES | Normal | Show TagsSorted Set |
Harder Example - Bit Inversions
Warning!
Problems marked as "Hard" or beyond in this module would likely be too difficult to appear on a USACO Silver contest.
Focus Problem – try your best to solve this problem before continuing!
View Internal SolutionSolution
We'll use iterators extensively.
Let the bit string be . In the set dif
, we store
all indices such that (including and ). If the
elements of dif
are , then the longest length is
equal to
We can store each of these differences in a multiset ret
; after each
inversion, we'll need to output the maximum element of ret
.
Inverting a bit at a 0-indexed position x
corresponds to inserting x
into
dif
if it not currently present or removing x
if it is, and then doing the
same with x+1
. Whenever we insert or remove an element of dif
, we should
update ret
accordingly.
C++
#include <bits/stdc++.h>using namespace std;#define sz(x) (x).size()string s;int m;set<int> dif;multiset<int> ret;
Java
Warning!
Java solutions are too slow for the CSES. Use C++ instead to get AC.
import java.io.*;import java.util.*;class BitInversion {public static TreeMap<Integer, Integer> ret = new TreeMap<Integer, Integer>();public static String s;public static int m;public static Set<Integer> dif = new TreeSet<Integer>();public static void modify(int x) {
Note that multiset has a high constant factor, so replacing ret
with a
priority queue and an array that stores the number of times each integer occurs
in the priority queue reduces the runtime by a factor of 2.
C++
#include <bits/stdc++.h>using namespace std;#define sz(x) (int)(x).size()string s;int m;set<int> dif;priority_queue<int> ret;int cnt[200005];
Java
import java.io.*;import java.util.*;class BitInversion {public static PriorityQueue<Integer> pq =new PriorityQueue<Integer>(Collections.reverseOrder());public static String s;public static int m;public static TreeSet<Integer> dif = new TreeSet<Integer>();public static int cnt[];
Harder Problems
Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|
Silver | Normal | Show TagsSorted Set | |||
Silver | Normal | Show TagsSorted Set | |||
CF | Normal | Show TagsGreedy, Sorted Set, Sorting | |||
CF | Normal | Show TagsGreedy, Multiset, Sorting | |||
CF | Normal | Show TagsCoordinate Compression, Prefix Sums, Sorted Set, Sorting | |||
Gold | Hard | Show TagsLinked List, Sorted Set | |||
AC | Hard | Show TagsGreedy, Sorted Set | |||
CF | Hard | Show TagsBinary Search, Sorted Set | |||
CF | Very Hard | Show TagsSorted Set | |||
CF | Insane | Show TagsSorted Set |
Module Progress:
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!