Solution.

Miguel Salgado 2022-08-02 13:44:53 -07:00
commit 6d55d3307e
2 changed files with 122 additions and 0 deletions

README.md Normal file

@ -0,0 +1,27 @@
Solution for a code challenge:
- The challenge is to create a program that computes some basic statistics on a collection of small positive integers. You can assume all values will be less than 1000.
- The `DataCapture` object accepts numbers and returns an object for querying statistics about the inputs. Specifically, the returned object supports querying how many numbers in the collection are less than a value, greater than a value, or within a range.
# Executing
The solution is implemented in a single file and targets Python 3.9 (`python39`).
To run the example, execute `python app.py` from the project directory, using that Python version.
# About the solution
The problem can be divided into two parts:
1. An object capable of _gathering_ an arbitrary amount of numbers (assumed to be positive and at most 1000).
2. An object that can answer, in constant time, how many numbers fall within a given range.
So, with the first object we construct a _frequency function_ that tells us how many times a given number occurs.
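As a minimal sketch of the frequency function, assuming the captured values are small positive integers: `collections.Counter` stands in here for the hand-built dictionary in `app.py`, and `frequency` is a hypothetical helper name, not part of the solution.

```python
from collections import Counter


def frequency(values):
    """Map each number to how many times it occurs in values."""
    # Counter is used for brevity; app.py builds a plain dict instead.
    return Counter(values)


freq = frequency([3, 9, 3, 4, 6])
print(freq[3])  # occurrences of 3
print(freq[5])  # a number that was never captured counts as zero
```

Building the table touches each input value once, so it takes linear time in the number of captured values.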
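Then we define a function that is the _integral_ (the running sum) of the frequency function, such that whenever we _evaluate it_ at a number, we get how many numbers exist that are less than or equal to it. The count of numbers strictly less than a value is then the evaluation at that value minus one.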
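The running sum can be sketched with `itertools.accumulate`. This is an illustration under assumptions, not the code in `app.py`: `cumulative` and `LIMIT` are hypothetical names, and in this sketch index `n` holds how many values are less than or equal to `n`, so "less than `v`" is read at index `v - 1`.

```python
from collections import Counter
from itertools import accumulate

LIMIT = 1002  # assumption: same table size as the loop in app.py


def cumulative(values):
    """Return a list where index n holds how many values are <= n."""
    freq = Counter(values)
    # Running sum of the frequencies for 0, 1, ..., LIMIT - 1.
    return list(accumulate(freq[n] for n in range(LIMIT)))


table = cumulative([3, 9, 3, 4, 6])
print(table[3])      # how many values are <= 3
print(table[4 - 1])  # how many values are < 4 (same entry)
print(table[-1])     # total number of captured values
```

The table has a fixed size, so building it costs a constant number of steps on top of the linear pass that counts frequencies.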
When we say function, we are not talking about a function in programming terms; we are talking about a discrete function as a _rule of correspondence_. Internally, this code represents it as a list where the nth element is the evaluation for the number (n-1).
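To make the list representation concrete, here is a hedged sketch of how the three queries can be read off such a list with plain index lookups. The free functions `build_table`, `less`, `between` and `greater` are hypothetical stand-ins for the methods in `app.py`, and the sketch assumes every value is an integer between 0 and 1000.

```python
from itertools import accumulate


def build_table(values, limit=1002):
    """Index n of the result holds how many values are <= n."""
    counts = [0] * limit
    for v in values:
        counts[v] += 1
    return list(accumulate(counts))


def less(table, value):
    # Values strictly below `value` are those <= value - 1.
    return table[value - 1] if value > 0 else 0


def between(table, low, high):
    # Closed range [low, high]: (<= high) minus (< low).
    return less(table, high + 1) - less(table, low)


def greater(table, value):
    # Everything captured minus the values <= value.
    return table[-1] - less(table, value + 1)


data = [3, 9, 3, 4, 6]
table = build_table(data)
print(less(table, 4), between(table, 3, 6), greater(table, 4))
```

Each query is a fixed number of list accesses and one subtraction, which is why it runs in constant time regardless of how many numbers were captured.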

app.py Normal file

@ -0,0 +1,95 @@
#!/usr/bin/env python3


class Stats:
    def __init__(self, data):
        """
        Here we create a _table_ (dictionary) with the
        frequency of the numbers contained in data,
        which takes a linear amount of time in the
        input length. Then we execute a loop with a constant
        number of iterations, which adds 1002 steps.
        But it's constant.
        So overall the __init__ method takes
        a linear amount of time in the data length.
        """
        _data = {}
        for v in data:
            _data[v] = _data.get(v, 0) + 1
        # _list[i] holds how many captured numbers are <= i.
        _list, prev = [], 0
        for i in range(0, 1002):
            new = _data.get(i, 0) + prev
            _list.append(new)
            prev = new
        self._list = _list

    def less(self, value):
        """
        Count the numbers strictly less than value.
        This method executes in constant time, because
        it's a list access by index.
        """
        if value <= 0:
            # Avoid a negative index wrapping around to the end of the list.
            return 0
        return self._list[value - 1]

    def between(self, m, M):
        """
        Count the numbers in the closed range [m, M].
        This method executes in constant time, because
        it calls a constant-time method twice,
        plus an arithmetic operation.
        """
        # (numbers <= M) minus (numbers < m)
        return self.less(M + 1) - self.less(m)

    def greater(self, value):
        """
        Count the numbers strictly greater than value.
        This method executes in constant time, because
        it calls a constant-time method twice,
        plus an arithmetic operation.
        """
        # less(1001) is the total, since all values are at most 1000.
        return self.less(1001) - self.less(value + 1)


class DataCapture:
    def __init__(self, *args, **kwargs):
        self._values = []

    def add(self, *values):
        """
        This method is arguably cheating: the extend method
        on a list takes time proportional to the number
        of objects you are adding.
        If you call this method to add a single element,
        it takes constant time, but if you are adding
        several numbers at once (by using the positional
        arguments), it takes linear time in the arguments.
        """
        self._values.extend(values)

    def build_stats(self):
        """
        Here we delegate everything to the Stats init,
        which receives a _sorted list_. I think that is
        a good assumption to start with, although sorting
        costs O(n log n) and Stats.__init__ does not
        depend on the order of its input.
        """
        return Stats(sorted(self._values))


if __name__ == "__main__":
    # Example code execution
    capture = DataCapture()
    capture.add(3)
    capture.add(9)
    capture.add(3)
    capture.add(4)
    capture.add(6)
    stats = capture.build_stats()
    val = stats.less(4)
    print("Less 4: (2) \t", val, val == 2)
    val = stats.between(3, 6)
    print("Between 3, 6: (4)\t", val, val == 4)
    val = stats.greater(4)
    print("Greater 4: (2) \t", val, val == 2)