Issue29 cache loading by GUID #73

weinshec · 2016-10-26T21:23:31Z

Implement cache details loading by GUID according to issue #29

Add method to retrieve a caches GUID
Add method to load all cache details from its print-page (with a few exceptions of properties that are not listed there)
Add id_ argument to Waypoints.from_html() allowing to specify the html-id of the waypoint table to parse

+ Collect all hardcoded urls used in Cache class in dictionary + Add private method to load a caches GUID for latter usage when loading the cache details via `load_by_guid()` + Add unittests for `load_by_guid()` and ensure that `load()` and `load_quick()` are not called

+ Parse all available cache details from its "print-page". + The following list of properties is excluded, since they are not contained in the "print-page": - original_location - state - found - pm_only + Add optional 'id' argument to Waypoints.from_html() classmethod to specify the html of the table to parse.

coveralls · 2016-10-26T21:25:18Z

Coverage increased (+0.4%) to 96.11% when pulling 31f68aa on weinshec:issue29_cache_loading_performance into dcb993f on tomasbedrich:master.

tomasbedrich

BIG thank you for your PR. There are some more changes to do, but you have done a lot of work. 👏

tomasbedrich · 2016-10-31T14:27:33Z

pycaching/cache.py

+
+        self.waypoints = Waypoint.from_html(content, "Waypoints")
+
+    def _load_guid(self):


Can you please drop this method and use load_quick instead of it?

Wouldn't using load_quick spoil the whole point of load_by_guid since most of the attributes are already contained there? I introduced this method, cause I thought it might be easier when you are going to refactor the code.

Should I remove it anyway?

The goal is to save a bandwidth. When I already do an request to self._urls["tiles_server"] (either by calling load_quick() or _load_guid()), I see no reason, why not to parse all the data got in a response. To prevent it to be messy, I wanted you to write a simple comment inside the load_by_guid() method.

You are right, that the loading code is somehow redundant now, but this is intentional – everyone can have a different starting info about a cache (WP or GUID) and can end up with different result, depending on his needs and loading method used.

So please either remove the method anyway / or, if you prefer, you can "wrap" the load_quick() method to the _load_guid() to have it more clear. Somehow like this:

def load_by_guid(self): ... _load_guid() ... def _load_guid(self): """COMMENT WHY""" return load_quick(self)

Taking different starting points (WP or GUID) into account is actually a good point. So I moved GUID loading to load_quick() and removed the redundant _load_guid() then. Now its possible to e.g. set the guid (if already known) in Cache ctor and call load_by_guid() (see according test case). Sorry that I missed that point in the first place :)

tomasbedrich · 2016-10-31T14:28:42Z

pycaching/cache.py

+            msg = res["msg"] if "msg" in res else "Unknown error (probably not existing cache)"
+            raise errors.LoadError("Cache {} cannot be loaded: {}".format(self, msg))
+
+        self.guid = res["data"][0]["g"]


If you just move this line to load_quick, it will have the same effect in the end. 😉

see comment above

tomasbedrich · 2016-10-31T14:31:24Z

pycaching/cache.py

+        :raise .PMOnlyException: If the PM only warning is shown on the page
+        """
+        if not hasattr(self, "guid"):
+            self._load_guid()


Change here – please see the note for line 752 + please add a comment on this line, why using load_quick.

see comment below

tomasbedrich · 2016-10-31T14:33:31Z

pycaching/cache.py

+            self._load_guid()
+        res = self.geocaching._request(self._urls["print_page"],
+                                       params={"guid": self.guid})
+        if res.find("p", {"class": "Warning"}) is not None:


Please be consistent with the rest of the library and use find("p", "Warning") for finding an element by a class. The same note applies multiple times below.

tomasbedrich · 2016-10-31T14:35:49Z

pycaching/cache.py

+            raise errors.PMOnlyException()
+        content = res.find(id="Content")
+
+        self.name = content.find("h2").get_text()


Please be consistent and use .text attribute.

tomasbedrich · 2016-10-31T14:40:05Z

pycaching/cache.py

+        self.location = Point.from_string(
+            content.find("p", {"class": "LatLong Meta"}).text)
+
+        type_img = os.path.basename(content.find("h2").img.get("src"))


Please be consistent and use ["src"] for HTML attribute access. Also please use find("img") for traversing the DOM tree.

BTW, good point with the os.path.basename! Can you please create a separate PR for cleaning up <img src="..."> parsing like this?

Actually the src attribute of img tags seems to be always accessed via get("src") all over the code. Do you want me to change them all (even outside load_by_guid)?

Good point! So there is an inconsistency in accessing the HTML attributes, because my previsous comment was based on this line.

I would be glad, if you could pick one way (whatever you prefer) of handling attributes and unite it all over the code – maybe in a separate PR to keep this focused.

tomasbedrich · 2016-10-31T14:48:32Z

pycaching/cache.py

+            "p", text=re.compile("Placed by:")).text.split("\r\n")[2].strip()
+
+        self.hidden = content.find(
+            "p", text=re.compile("Placed Date:")).text.split(": ")[1]


Please make this a little bit more solid. This can easily break, if some user sets a date format containing :.

tomasbedrich · 2016-10-31T14:50:28Z

pycaching/cache.py

+
+        attr_img = content.find_all("img", src=re.compile("\/attributes\/"))
+        attributes_raw = [
+            _.get("src").split("/")[-1].rsplit("-", 1) for _ in attr_img


You can use os.path.basepath like you already did. 😉

tomasbedrich · 2016-10-31T14:53:47Z

pycaching/cache.py

+        self.description = content.find(
+            "h2", text="Long Description").find_next("div").text
+
+        self.hint = rot13(content.find(id="uxDecryptedHint").text)


You can use uxEncryptedHint without rot13.

tomasbedrich · 2016-10-31T15:01:15Z

pycaching/cache.py

@@ -831,10 +918,14 @@ def __init__(self, id=None, type=None, location=None, note=None):
        self._note = note

    @classmethod
-    def from_html(cls, root):
-        """Return a dictionary of all waypoints found in the page representation"""
+    def from_html(cls, root, id_="ctl00_ContentBody_Waypoints"):


Please remove the default value of id_ and propagate it to the method call (line 650). Also maybe rename the variables root and id somehow more meaningful...

coveralls · 2016-11-09T16:42:56Z

Coverage increased (+0.1%) to 95.844% when pulling 185cd38 on weinshec:issue29_cache_loading_performance into dcb993f on tomasbedrich:master.

+ GUID is now set by load_quick(), since the _load_guid() and load_quick() requested the same page anyway + Add property getter/setter for GUID with regex validator

coveralls · 2016-12-11T23:38:33Z

Coverage increased (+0.05%) to 95.763% when pulling d5edee2 on weinshec:issue29_cache_loading_performance into dcb993f on tomasbedrich:master.

tomasbedrich

The last change before merging - just the comment. ;)

tomasbedrich · 2016-12-12T09:48:33Z

pycaching/cache.py

+        return self._guid
+
+    @guid.setter
+    def guid(self, guid):


👍 for validating a guid

tomasbedrich · 2016-12-12T09:51:04Z

pycaching/cache.py

@@ -697,7 +714,7 @@ def load_by_guid(self):
        :raise .PMOnlyException: If the PM only warning is shown on the page
        """
        if not hasattr(self, "guid"):
-            self._load_guid()
+            self.load_quick()


Please add a brief comment why.

+ `Cache.guid` now return `None` in case it has not yet been set + Add test that checks `load_quick()` is called in case guid is `None` + Add comment to `load_by_guid` when calling `load_quick`

coveralls · 2016-12-12T13:06:14Z

Coverage increased (+0.1%) to 95.852% when pulling d922e3f on weinshec:issue29_cache_loading_performance into dcb993f on tomasbedrich:master.

tomasbedrich

Thank you. :)

Christoph Weinsheimer added 2 commits October 1, 2016 12:11

tomasbedrich requested changes Oct 31, 2016

View reviewed changes

Christoph Weinsheimer added 7 commits October 31, 2016 18:02

Refactor Waypoint.from_html arguments

35ceeb2

Use correct table to avoid rot13 of encrypted hint

5bb6eab

Refactor attribute img tag src parsing

b08d64f

Make parsing hidden date more robust

7b652ff

Refactor finding cache type img at load_by_guid

aa9dbad

Streamline attribute access finding cache name

2ee54e2

Streamline finding HTML tags by class name

185cd38

Remove _load_guid() and load GUID by load_quick()

d5edee2

+ GUID is now set by load_quick(), since the _load_guid() and load_quick() requested the same page anyway + Add property getter/setter for GUID with regex validator

tomasbedrich requested changes Dec 12, 2016

View reviewed changes

Add comment to load_by_guid() and fix guid getter

d922e3f

+ `Cache.guid` now return `None` in case it has not yet been set + Add test that checks `load_quick()` is called in case guid is `None` + Add comment to `load_by_guid` when calling `load_quick`

tomasbedrich approved these changes Dec 12, 2016

View reviewed changes

tomasbedrich merged commit 3d3504a into tomasbedrich:master Dec 12, 2016

weinshec deleted the issue29_cache_loading_performance branch December 13, 2016 12:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue29 cache loading by GUID #73

Issue29 cache loading by GUID #73

weinshec commented Oct 26, 2016

coveralls commented Oct 26, 2016 •

edited

Loading

tomasbedrich left a comment

tomasbedrich Oct 31, 2016

weinshec Nov 9, 2016

tomasbedrich Nov 10, 2016

weinshec Dec 11, 2016 •

edited

Loading

tomasbedrich Oct 31, 2016

weinshec Nov 9, 2016

tomasbedrich Oct 31, 2016

weinshec Nov 9, 2016

tomasbedrich Oct 31, 2016

tomasbedrich Oct 31, 2016

tomasbedrich Oct 31, 2016

weinshec Nov 9, 2016

tomasbedrich Nov 10, 2016

tomasbedrich Oct 31, 2016

tomasbedrich Oct 31, 2016

tomasbedrich Oct 31, 2016

tomasbedrich Oct 31, 2016

coveralls commented Nov 9, 2016 •

edited

Loading

coveralls commented Dec 11, 2016 •

edited

Loading

tomasbedrich left a comment

tomasbedrich Dec 12, 2016

tomasbedrich Dec 12, 2016

weinshec Dec 12, 2016

coveralls commented Dec 12, 2016 •

edited

Loading

tomasbedrich left a comment


		self.waypoints = Waypoint.from_html(content, "Waypoints")

		def _load_guid(self):

Issue29 cache loading by GUID #73

Issue29 cache loading by GUID #73

Conversation

weinshec commented Oct 26, 2016

coveralls commented Oct 26, 2016 • edited Loading

tomasbedrich left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weinshec Dec 11, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coveralls commented Nov 9, 2016 • edited Loading

coveralls commented Dec 11, 2016 • edited Loading

tomasbedrich left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coveralls commented Dec 12, 2016 • edited Loading

tomasbedrich left a comment

Choose a reason for hiding this comment

coveralls commented Oct 26, 2016 •

edited

Loading

weinshec Dec 11, 2016 •

edited

Loading

coveralls commented Nov 9, 2016 •

edited

Loading

coveralls commented Dec 11, 2016 •

edited

Loading

coveralls commented Dec 12, 2016 •

edited

Loading