Help with relationship tracking script

So, as I mentioned in the ranting thread, I wanted to make a script to track a person's blood relations, so I can add a line like "He is your great great uncle." when I honours someone. At first, I thought this would be simple. Find the lowest common ancestor of person A and person B, look at the distance from that ancestor for each of them, now you know exactly how person A and person B are related. Easy.

But then I remembered incest, and everything was ruined. One person can have multiple distances to the lowest common ancestor, there can be a bunch of lowest common ancestors, there can be ancestors who aren't the lowest but still matter for determining the relationship, or all of the above. I've been thinking about it for a couple days now, and no matter what I try I can't think of a method/algorithm for determining which common ancestors matter for determining the relationship between two people. Every time I think of something that might work, I can find an example that screws it up.

So, does anyone have any ideas for how to find all the blood relationships between two people, accounting for any amount of entanglement due to incest? I tried searching for family tree programs or anything else that might already do this, but I couldn't find anything useful.

Comments

  • NizarisNizaris The Holy City of Mhaldor
    edited April 2013
    @Sena: I think that you could benefit from treating this problem like the problem of finding a path to a destination: there are several ways to get from point A to point B, but find the shortest one. So instead of rooms or streets, each person that you have to "travel" through would be the node.

    I recommend a modified A* search algorithm. Have a look at this website, and instead of squares with multiple connections, picture characters with multiple connections:

    image
  • NizarisNizaris The Holy City of Mhaldor
    edited April 2013
    So here, we could define relationships as "steps" in our search to connect person A to person B. You assign a cost to taking each step, eg. parent/child relationship costs 1 (because a new generation is involved), sibling-sibling costs 0 (because there is no new generation). Then, the A* algorithm will always select the lowest total cost to get there.

    Examples:

    Bob and Susie are brother and sister. They decide to have a child, named Hank. Hank is both a nephew to Bob, as well as a child. This algorithm would report that he is a child, only, due to fewer steps [Bob -> Hank = 1 step, cost 1 (child); Bob -> Susie -> Hank = 2 steps, cost 1 (nephew)]

    Tim has a daughter, Grace. Then, Tim and Grace have another child: Nancy. From Grace's perspective, Nancy is both daughter and sister. But, this algorithm would report sister, only, but this time due to lower cost. [Grace -> Nancy = 1 step, cost 0 (sister); Grace -> Nancy = 1 step, cost 1 (daughter)]
    image
  • edited April 2013
    I want to find every relationship though, not just stop at the first one I find.

    Also, it's better modelled as a directed acyclic graph. With that, all I need to do is calculate the distance from certain common ancestor nodes, there doesn't need to be any real pathfinding involved, plus it saves on storage because all I need to know about a person is their parents. That part is easy. The only real problem is knowing which common ancestors matter.

    For example, say you're looking for the relationship between two people (named A and B). B is the daughter of A's sister and A's great grandfather. The lowest common ancestor is A's father (distance of 1 from A and 2 from B), and that tells us that B is A's niece. But A's great grandfather is another common ancestor that matters (because he's necessary to see that B is also A's great niece), and his distance is 3 from A and 1 (also 4, but that doesn't matter here) from B. If we just look for the shortest path or lowest common ancestor, we ignore that relationship.
  • NizarisNizaris The Holy City of Mhaldor
    @Sena: A* uses a branching search. It will evaluate multiple paths, but do so intelligently to return the lowest cost path to get there.

    Hrm. It's really pretty hard to describe. Work through the examples and pictures in that website I gave you.

    Actually, I think that you would end up using Dijkstra's algorithm, since I don't think that you'll be able to come up with a fast heurstic. That will broaden your search quite a bit.
    image
  • edited April 2013
    Like I said though (not sure if you saw my edit before you posted, but I gave an example of where it wouldn't work), finding the shortest or lowest cost path is useless to me in this case.

    Also, if I did use a more general pathfinding algorithm, I'd need to find the lowest cost path that goes through a specific node, not just the lowest cost in general. And then I'd still have exactly the same problem; I need a way to find out which nodes I need to find paths through.
  • NizarisNizaris The Holy City of Mhaldor
    Sena said:

    For example, say you're looking for the relationship between two people (named A and B). B is the daughter of A's sister and A's great grandfather. The lowest common ancestor is A's father (distance of 1 from A and 2 from B), and that tells us that B is A's niece. But A's great grandfather is another common ancestor that matters (because he's necessary to see that B is also A's great niece), and his distance is 3 from A and 1 (also 4, but that doesn't matter here) from B. If we just look for the shortest path or lowest common ancestor, we ignore that relationship.
    Score = distance + cost
    Cost = accrued when step results in generation cross.

    S = A's sister
    F = A's father
    G = A's grandfather
    T = A's great-grandfather

    B = daughter of S + T

    Path 1
    ---------
    A -> S -> B 
    S is sister of A. 1 step, cost 0. Step Score = 1.
    B is daughter of S. 1 step, cost 1. Step score = 2.
    Total score = 3.
    Result: Niece

    Path 2
    ---------
    A -> F -> G -> T -> B
    F is father of A. 1 step, cost 1. Step score = 2.
    G is father of F. 1 step, cost 1. Step score = 2.
    T is father of G. 1 step, cost 1. Step score = 2.
    B is daughter of G. 1 step, cost 1. Step score = 2.
    Total score = 8.
    Result = Great aunt.

    After running both possibilities, we finalize based upon which "path" has the lowest score (not distance), and select Path 1, because it is more closely related.
    image
  • With that, you can't tell the relationship based on the score. A score of 3 could be niece/nephew, or it could be uncle/aunt. A score of 8 could be great aunt/uncle, great niece/nephew, great great great grandparent, etc.

    And when you add the entire family tree, it's not practical to calculate every possible path (because I don't want the closest relationship, I want every relationship).

    With a DAG, the distance does tell you the relationship. Without incest, if person A's distance from the lowest common ancestor is 1 and person B's distance is 2, you know for certain that person B is person A's niece/nephew.

    The only complication added by incest (and so the only problem I need to solve) is not knowing which common ancestors I need to look at.
  • NizarisNizaris The Holy City of Mhaldor
    Ok. I misunderstood you and got it backwards. You want to list every relationship, but not explore every path. I thought that you wanted to list only the closest relationship, but explore every path.

    RE: score not giving you relationship. Yep. Relationship name (aunt, great-uncle, etc) would be calculated separately from the score. I had intended for the single best one to be selected in my algorithm based on the score, not for it to represent relationship name implicitly.
    image
  • Sena said:
    But then I remembered incest, and everything was ruined.
  • My current idea is to find every common ancestor with a path to either of the two people that doesn't go through another common ancestor. I can't find any counter-examples so far, but that doesn't mean there aren't any. Can anyone think of a situation where this wouldn't work?
  • Sena said:
    My current idea is to find every common ancestor with a path to either of the two people that doesn't go through another common ancestor. I can't find any counter-examples so far, but that doesn't mean there aren't any. Can anyone think of a situation where this wouldn't work?
    Is the restriction on not going through another common ancestor because you don't care about the extra relationships, or just because you don't want to deal with added complications with it? You do miss out on some relationships if you skip links that go through another common ancestor. For example, say Alice and Bob are first cousins, and their common grandparents are also first cousins. Then they share a path to their great great grandparents which also goes through their grandparents, making them both 1st cousins and 3rd cousins. If you just find common ancestors with a path to either that doesn't go through another common ancestor, you'll only list them as first cousins.
  • Eld said:
    Is the restriction on not going through another common ancestor because you don't care about the extra relationships, or just because you don't want to deal with added complications with it?
    It was an idea for ruling out irrelevant common ancestors without overlooking relevant common ancestors. But as you pointed out, it doesn't work.

    The reason I can't simply look at every common ancestor is that most of them will be useless (for example, looking at the distance from their grandparents for two siblings can't tell you whether they're siblings or first cousins), and I can't think of a way to check for those ambiguities.
Sign In or Register to comment.