mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-31 02:02:21 +06:00
[testing utils] get_auto_remove_tmp_dir more intuitive behavior (#8401)
* [testing utils] get_auto_remove_tmp_dir default change Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal. 99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug. * simplify things * update docs * fix doc layout * style * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * better 3-state doc * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * s/tmp/temporary/ + style * correct the statement Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
This commit is contained in:
parent
e7e1549895
commit
e21340da7a
@ -716,11 +716,11 @@ Temporary files and directories
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Using unique temporary files and directories are essential for parallel test running, so that the tests won't overwrite
|
||||
each other's data. Also we want to get the temp files and directories removed at the end of each test that created
|
||||
each other's data. Also we want to get the temporary files and directories removed at the end of each test that created
|
||||
them. Therefore, using packages like ``tempfile``, which address these needs is essential.
|
||||
|
||||
However, when debugging tests, you need to be able to see what goes into the temp file or directory and you want to
|
||||
know it's exact path and not having it randomized on every test re-run.
|
||||
However, when debugging tests, you need to be able to see what goes into the temporary file or directory and you want
|
||||
to know it's exact path and not having it randomized on every test re-run.
|
||||
|
||||
A helper class :obj:`transformers.test_utils.TestCasePlus` is best used for such purposes. It's a sub-class of
|
||||
:obj:`unittest.TestCase`, so we can easily inherit from it in the test modules.
|
||||
@ -736,32 +736,33 @@ Here is an example of its usage:
|
||||
|
||||
This code creates a unique temporary directory, and sets :obj:`tmp_dir` to its location.
|
||||
|
||||
In this and all the following scenarios the temporary directory will be auto-removed at the end of test, unless
|
||||
``after=False`` is passed to the helper function.
|
||||
|
||||
* Create a temporary directory of my choice and delete it at the end - useful for debugging when you want to monitor a
|
||||
specific directory:
|
||||
* Create a unique temporary dir:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
|
||||
tmp_dir = self.get_auto_remove_tmp_dir()
|
||||
|
||||
* Create a temporary directory of my choice and do not delete it at the end---useful for when you want to look at the
|
||||
temp results:
|
||||
``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
|
||||
test.
|
||||
|
||||
* Create a temporary dir of my choice, ensure it's empty before the test starts and don't empty it after the test.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
|
||||
tmp_dir = self.get_auto_remove_tmp_dir("./xxx")
|
||||
|
||||
* Create a temporary directory of my choice and ensure to delete it right away---useful for when you disabled deletion
|
||||
in the previous test run and want to make sure the that temporary directory is empty before the new test is run:
|
||||
This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests didn't
|
||||
leave any data in there.
|
||||
|
||||
.. code-block:: python
|
||||
* You can override the default behavior by directly overriding the ``before`` and ``after`` args, leading to one of the
|
||||
following behaviors:
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
|
||||
- ``before=True``: the temporary dir will always be cleared at the beginning of the test.
|
||||
- ``before=False``: if the temporary dir already existed, any existing files will remain there.
|
||||
- ``after=True``: the temporary dir will always be deleted at the end of the test.
|
||||
- ``after=False``: the temporary dir will always be left intact at the end of the test.
|
||||
|
||||
.. note::
|
||||
In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are allowed if
|
||||
@ -815,7 +816,7 @@ or the ``xfail`` way:
|
||||
@pytest.mark.xfail
|
||||
def test_feature_x():
|
||||
|
||||
Here is how to skip a test based on some internal check inside the test:
|
||||
- Here is how to skip a test based on some internal check inside the test:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
@ -838,7 +839,7 @@ or the ``xfail`` way:
|
||||
def test_feature_x():
|
||||
pytest.xfail("expected to fail until bug XYZ is fixed")
|
||||
|
||||
Here is how to skip all tests in a module if some import is missing:
|
||||
- Here is how to skip all tests in a module if some import is missing:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
|
@ -522,45 +522,47 @@ class TestCasePlus(unittest.TestCase):
|
||||
- ``repo_root_dir_str``
|
||||
- ``src_dir_str``
|
||||
|
||||
Feature 2: Flexible auto-removable temp dirs which are guaranteed to get removed at the end of test.
|
||||
Feature 2: Flexible auto-removable temporary dirs which are guaranteed to get removed at the end of test.
|
||||
|
||||
In all the following scenarios the temp dir will be auto-removed at the end of test, unless `after=False`.
|
||||
|
||||
# 1. create a unique temp dir, `tmp_dir` will contain the path to the created temp dir
|
||||
1. Create a unique temporary dir:
|
||||
|
||||
::
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir()
|
||||
|
||||
# 2. create a temp dir of my choice and delete it at the end - useful for debug when you want to # monitor a
|
||||
specific directory
|
||||
``tmp_dir`` will contain the path to the created temporary dir. It will be automatically removed at the end of the
|
||||
test.
|
||||
|
||||
|
||||
2. Create a temporary dir of my choice, ensure it's empty before the test starts and don't
|
||||
empty it after the test.
|
||||
|
||||
::
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")
|
||||
tmp_dir = self.get_auto_remove_tmp_dir("./xxx")
|
||||
|
||||
# 3. create a temp dir of my choice and do not delete it at the end - useful for when you want # to look at the
|
||||
temp results
|
||||
This is useful for debug when you want to monitor a specific directory and want to make sure the previous tests
|
||||
didn't leave any data in there.
|
||||
|
||||
::
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)
|
||||
3. You can override the first two options by directly overriding the ``before`` and ``after`` args, leading to the
|
||||
following behavior:
|
||||
|
||||
# 4. create a temp dir of my choice and ensure to delete it right away - useful for when you # disabled deletion in
|
||||
the previous test run and want to make sure the that tmp dir is empty # before the new test is run
|
||||
``before=True``: the temporary dir will always be cleared at the beginning of the test.
|
||||
|
||||
::
|
||||
``before=False``: if the temporary dir already existed, any existing files will remain there.
|
||||
|
||||
def test_whatever(self):
|
||||
tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)
|
||||
``after=True``: the temporary dir will always be deleted at the end of the test.
|
||||
|
||||
Note 1: In order to run the equivalent of `rm -r` safely, only subdirs of the project repository checkout are
|
||||
allowed if an explicit `tmp_dir` is used, so that by mistake no `/tmp` or similar important part of the filesystem
|
||||
will get nuked. i.e. please always pass paths that start with `./`
|
||||
``after=False``: the temporary dir will always be left intact at the end of the test.
|
||||
|
||||
Note 2: Each test can register multiple temp dirs and they all will get auto-removed, unless requested otherwise.
|
||||
Note 1: In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are
|
||||
allowed if an explicit ``tmp_dir`` is used, so that by mistake no ``/tmp`` or similar important part of the
|
||||
filesystem will get nuked. i.e. please always pass paths that start with ``./``
|
||||
|
||||
Note 2: Each test can register multiple temporary dirs and they all will get auto-removed, unless requested
|
||||
otherwise.
|
||||
|
||||
Feature 3: Get a copy of the ``os.environ`` object that sets up ``PYTHONPATH`` specific to the current test suite.
|
||||
This is useful for invoking external programs from the test suite - e.g. distributed training.
|
||||
@ -573,6 +575,7 @@ class TestCasePlus(unittest.TestCase):
|
||||
"""
|
||||
|
||||
def setUp(self):
|
||||
# get_auto_remove_tmp_dir feature:
|
||||
self.teardown_tmp_dirs = []
|
||||
|
||||
# figure out the resolved paths for repo_root, tests, examples, etc.
|
||||
@ -660,21 +663,42 @@ class TestCasePlus(unittest.TestCase):
|
||||
env["PYTHONPATH"] = ":".join(paths)
|
||||
return env
|
||||
|
||||
def get_auto_remove_tmp_dir(self, tmp_dir=None, after=True, before=False):
|
||||
def get_auto_remove_tmp_dir(self, tmp_dir=None, before=None, after=None):
|
||||
"""
|
||||
Args:
|
||||
tmp_dir (:obj:`string`, `optional`):
|
||||
use this path, if None a unique path will be assigned
|
||||
before (:obj:`bool`, `optional`, defaults to :obj:`False`):
|
||||
if `True` and tmp dir already exists make sure to empty it right away
|
||||
after (:obj:`bool`, `optional`, defaults to :obj:`True`):
|
||||
delete the tmp dir at the end of the test
|
||||
if :obj:`None`:
|
||||
|
||||
- a unique temporary path will be created
|
||||
- sets ``before=True`` if ``before`` is :obj:`None`
|
||||
- sets ``after=True`` if ``after`` is :obj:`None`
|
||||
else:
|
||||
|
||||
- :obj:`tmp_dir` will be created
|
||||
- sets ``before=True`` if ``before`` is :obj:`None`
|
||||
- sets ``after=False`` if ``after`` is :obj:`None`
|
||||
before (:obj:`bool`, `optional`):
|
||||
If :obj:`True` and the :obj:`tmp_dir` already exists, make sure to empty it right away if :obj:`False`
|
||||
and the :obj:`tmp_dir` already exists, any existing files will remain there.
|
||||
after (:obj:`bool`, `optional`):
|
||||
If :obj:`True`, delete the :obj:`tmp_dir` at the end of the test if :obj:`False`, leave the
|
||||
:obj:`tmp_dir` and its contents intact at the end of the test.
|
||||
|
||||
Returns:
|
||||
tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-created tmp
|
||||
tmp_dir(:obj:`string`): either the same value as passed via `tmp_dir` or the path to the auto-selected tmp
|
||||
dir
|
||||
"""
|
||||
if tmp_dir is not None:
|
||||
|
||||
# defining the most likely desired behavior for when a custom path is provided.
|
||||
# this most likely indicates the debug mode where we want an easily locatable dir that:
|
||||
# 1. gets cleared out before the test (if it already exists)
|
||||
# 2. is left intact after the test
|
||||
if before is None:
|
||||
before = True
|
||||
if after is None:
|
||||
after = False
|
||||
|
||||
# using provided path
|
||||
path = Path(tmp_dir).resolve()
|
||||
|
||||
@ -691,6 +715,15 @@ class TestCasePlus(unittest.TestCase):
|
||||
path.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
else:
|
||||
# defining the most likely desired behavior for when a unique tmp path is auto generated
|
||||
# (not a debug mode), here we require a unique tmp dir that:
|
||||
# 1. is empty before the test (it will be empty in this situation anyway)
|
||||
# 2. gets fully removed after the test
|
||||
if before is None:
|
||||
before = True
|
||||
if after is None:
|
||||
after = True
|
||||
|
||||
# using unique tmp dir (always empty, regardless of `before`)
|
||||
tmp_dir = tempfile.mkdtemp()
|
||||
|
||||
@ -701,7 +734,8 @@ class TestCasePlus(unittest.TestCase):
|
||||
return tmp_dir
|
||||
|
||||
def tearDown(self):
|
||||
# remove registered temp dirs
|
||||
|
||||
# get_auto_remove_tmp_dir feature: remove registered temp dirs
|
||||
for path in self.teardown_tmp_dirs:
|
||||
shutil.rmtree(path, ignore_errors=True)
|
||||
self.teardown_tmp_dirs = []
|
||||
|
Loading…
Reference in New Issue
Block a user