Welcome to Fakeme’s documentation!
Fakeme Data Generator for Chained and Relative Data
Currently used under the hood:
- Mimesis (as one of the generators for field values)
- Pandas (as the main tool for combining data into tables (frames))
- Python standard library
How to use
pip install fakeme
from fakeme import Fakeme
Fakeme(
    # tables - list of table definitions that you want to generate;
    # each must provide a table_id and a schema, either as an object or as a path to a schema file
    tables=[
        ('dogs', [{'name': 'name'}, {'name': 'breed'}, {'name': 'id'}]),
        ('amazing_animals', 'cats', [{'name': 'Id'}, {'name': 'breed'}, {'name': 'Name'}])
    ]
).run()
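The example above uses two tuple shapes: a two-item one and a three-item one. The sketch below shows one way to tell them apart by length. It assumes the three-item form is (dataset, table_id, schema), which the example suggests but does not state. The helper name normalize_table is hypothetical and is not part of Fakeme's API.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

# A schema is either an inline list of column dicts or a path to a schema file.
Schema = Union[str, List[Dict[str, Any]]]


def normalize_table(definition: tuple) -> Tuple[Optional[str], str, Schema]:
    """Return (dataset, table_id, schema) for a 2- or 3-item table definition.

    Hypothetical helper for illustration only; Fakeme's own handling may differ.
    """
    if len(definition) == 2:
        table_id, schema = definition
        return None, table_id, schema
    if len(definition) == 3:
        dataset, table_id, schema = definition
        return dataset, table_id, schema
    raise ValueError("expected (table_id, schema) or (dataset, table_id, schema)")


tables = [
    ('dogs', [{'name': 'name'}, {'name': 'breed'}, {'name': 'id'}]),
    ('amazing_animals', 'cats', [{'name': 'Id'}, {'name': 'breed'}, {'name': 'Name'}]),
]
normalized = [normalize_table(t) for t in tables]
```

Under this reading, 'dogs' has no dataset, while 'cats' belongs to the 'amazing_animals' dataset.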
To define relationships between tables, use the ‘rls’ parameter.
...
Fakeme(
    tables=list_of_tables,
    dump_schema=True,
    params={'row_numbers': 15},  # how many rows to generate, default value = 100
    # rls stands for relationships - defines relationships between tables,
    # i.e. which field depends on which
    rls={'warehouse': {'part_id': {'alias': 'part_identification',
                                   'matches': 1,
                                   'table': 'parts'}}
         }).run()
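To make the intent of the ‘rls’ entry more concrete, here is a conceptual sketch in plain Python. It is not Fakeme's implementation. It assumes that 'alias' names the matching column in the 'parts' table and that 'matches': 1 means every 'part_id' in 'warehouse' is drawn from that column. Both readings are assumptions, as are the generated values.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

ROW_NUMBERS = 15  # mirrors params={'row_numbers': 15}

# Parent table 'parts' with the column the relationship points to.
parts = [{'part_identification': f'P{i:03d}'} for i in range(ROW_NUMBERS)]
parent_values = [row['part_identification'] for row in parts]

# Child table 'warehouse': each 'part_id' is picked from the parent column,
# which is how this sketch interprets 'matches': 1 (an assumption).
warehouse = [{'part_id': random.choice(parent_values)} for _ in range(ROW_NUMBERS)]

# Every child value resolves to a row in the parent table.
all_matched = all(row['part_id'] in parent_values for row in warehouse)
```

The point of chaining tables this way is that joins between the generated files return rows instead of coming back empty.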
Full example in:
A lot of different examples are available in:
Basic terminology
Dataset - analogous to BigQuery datasets, so it corresponds to a schema in a database: a group of tables. It is not important right now, but you can group tables this way.
Table_id - similar to a table name: the main identifier for your table (a ‘dataframe’ during data generation). This name is needed to define aliases, links between tables, and aggregations, if any. This ‘id’ or ‘name’ is also used as the file name for the output.
Column -
Field -
Schema -
Rule -
Generator -
Output -
A Dataset can contain multiple Tables.
Tables are defined by a Schema - a description of the Columns (which Fields we want in the table, with what values and of what type).
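The relationship between these terms can be sketched with two small data classes. This is only a mental model under assumed names (Dataset, Table, column_names). It does not reflect classes that Fakeme itself exposes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    table_id: str                 # also used as the output file name
    schema: List[Dict[str, str]]  # one dict per column, e.g. {'name': 'breed'}

    def column_names(self) -> List[str]:
        """Names of the columns described by the schema."""
        return [column['name'] for column in self.schema]


@dataclass
class Dataset:
    name: str
    tables: List[Table] = field(default_factory=list)  # a dataset groups tables


animals = Dataset('amazing_animals', [
    Table('cats', [{'name': 'Id'}, {'name': 'breed'}, {'name': 'Name'}]),
])
cat_columns = animals.tables[0].column_names()
```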