PMXBOT Log file Viewer

#mongodb logs for Monday the 4th of May, 2015

[00:00:11] <parallel21> brb
[00:00:44] <Douhan> I have a words collection. If a user visits a word page (/word/hello <- shows info about word: hello) ----> Where should I store the information about user visit to that page? Inside that word object or in another collection called "Statistics"?
[00:01:29] <Douhan> For example I want to count how many times word "hello" is visited?
[00:02:18] <StephenLynx> you could use find and modify
[00:02:23] <Douhan> Should I put { .. word: 'hello', viewCount: n .. } in a Word object or in a Statistics object?
[00:02:30] <StephenLynx> so you would update how much it was accessed and get it at the same time.
[00:02:47] <StephenLynx> and you could even upsert while at it.
[00:03:29] <Douhan> Sorry my English is bad, where should I put "viewCount: integer" to which object?
[00:03:37] <StephenLynx> the word.
[00:03:52] <Douhan> What if I want to store IP address and username who viewed it?
[00:04:01] <StephenLynx> no problem.
[00:04:13] <StephenLynx> you can push them to a list.
[00:04:25] <Douhan> Do I just store all that info in Word object? Why?
[00:04:25] <StephenLynx> you would have these fields on the word collection
[00:04:38] <StephenLynx> because that way you can do all you need in a single query
[00:04:40] <StephenLynx> the fields
[00:05:09] <StephenLynx> word, accessCount, accesses. accesses hold objects that hold the fields user, ip and whatever you want.
[00:05:22] <Douhan> Is querying more than 1 time considered unprofessional?
[00:05:32] <Douhan> Querying another collection I mean
[00:05:33] <StephenLynx> and you could even use the operator $addToSet so you wouldn't have repeated entries on accesses
[00:05:39] <Douhan> related to that Word collection
[00:05:42] <StephenLynx> think like this:
[00:06:02] <StephenLynx> if you have a car that uses a liter of gasoline to run X distance and another that uses 2 liters
[00:06:14] <StephenLynx> what car is better?
[00:06:17] <StephenLynx> which*
[00:06:30] <StephenLynx> if you can do a better job by querying it once, why query twice?
[00:06:58] <Douhan> How would I query which one is better?
[00:07:19] <StephenLynx> wat
[00:07:36] <Douhan> Nevermind :)
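A minimal sketch of the single-query approach StephenLynx describes above, built for the mongo shell's db.collection.findAndModify(). The field names (word, accessCount, accesses) follow the chat; the helper name buildVisitUpdate, the collection name "words", and the sample user and IP are assumptions made for illustration.

```javascript
// Hypothetical helper: builds the argument for db.words.findAndModify().
// The "words" collection name and this helper are assumptions, not from the chat.
function buildVisitUpdate(word, user, ip) {
  return {
    query: { word: word },
    update: {
      $inc: { accessCount: 1 },                        // bump the view counter
      $addToSet: { accesses: { user: user, ip: ip } }  // skip exact duplicate entries
    },
    upsert: true, // create the word document on its first visit
    new: true     // return the document as it looks after the update
  };
}

const visit = buildVisitUpdate("hello", "douhan", "203.0.113.7");
console.log(JSON.stringify(visit, null, 2));

// In the mongo shell this would be passed as:
//   db.words.findAndModify(visit)
```

One design caveat: an accesses array that grows with every distinct visitor can eventually run into the 16 MB document size limit, so a separate collection may suit high-traffic words better.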
[03:34:49] <kakashiA1> I am using mongoose with angular. I have a list of 10 users and I can delete the third and fifth user with no problem but I dont know why this code is working: https://paste.xinu.at/Svwz2/js
[03:34:52] <kakashiA1> it is using the pop() method and that is what I dont understand
[03:48:09] <tejasmanohar> db.getCollection('tickets').find({user: ObjectId("5530c8d438748985f77743a0")},{draw: ObjectId("550e3e4051921e670ffc7f72")})
[03:48:20] <tejasmanohar> im trying to find tickets with user: as that object id and draw as that object id
[03:49:06] <joannac> okay
[03:49:28] <joannac> so it should be db.getCollection('tickets').find({user: ObjectId("5530c8d438748985f77743a0"), draw: ObjectId("550e3e4051921e670ffc7f72")})
[03:51:12] <tejasmanohar> db.getCollection('tickets').find({user: ObjectId("5530c8d438748985f77743a0"), draw: ObjectId("550e3e4051921e670ffc7f72")}) ?
[03:51:19] <tejasmanohar> joannac: ^ ?
[03:52:34] <joannac> tejasmanohar: what's the question?
[03:52:48] <tejasmanohar> JonathanMcClare: is that correct syntax?
[03:52:55] <tejasmanohar> db.getCollection('tickets').find({user: ObjectId("5530c8d438748985f77743a0"), draw: ObjectId("550e3e4051921e670ffc7f72")})
[03:53:01] <tejasmanohar> * joannac :
[03:53:04] <joannac> tejasmanohar: looks right. try it and see
[03:53:24] <tejasmanohar> not pulling what i expected, maybe my data is actually not matching. ill look
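A small sketch of why the first attempt and the corrected call differ: find() takes the filter as its first argument and a projection as its second, so a condition placed in the second argument is not used for matching. Plain strings stand in for the ObjectId values here so the snippet runs outside the mongo shell; that substitution is an assumption for illustration only.

```javascript
// Stand-ins for the two ObjectId values; in the mongo shell they would be
// wrapped as ObjectId("...").
const userId = "5530c8d438748985f77743a0";
const drawId = "550e3e4051921e670ffc7f72";

// First attempt: two separate arguments. find() treats the second one as a
// projection (which fields to return), not as another condition.
const firstAttempt = [{ user: userId }, { draw: drawId }];

// Corrected form: both conditions in one filter document, implicitly ANDed.
const corrected = [{ user: userId, draw: drawId }];

console.log(Object.keys(firstAttempt[0])); // [ 'user' ]
console.log(Object.keys(corrected[0]));    // [ 'user', 'draw' ]

// mongo shell equivalent of the corrected form:
//   db.getCollection('tickets').find({ user: ObjectId(userId), draw: ObjectId(drawId) })
```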
[03:57:22] <kakashiA1> does anybody know why I got that behavior with mongoose and angular?
[03:58:00] <kakashiA1> why this code works, even if I delete the third or fifth user: https://paste.xinu.at/Svwz2/js
[08:36:27] <kkuno> hello
[08:36:47] <kkuno> I'm gonna set up a replica network
[08:36:53] <kkuno> I don't remember one thing:
[08:37:33] <kkuno> there should be a server which acts as a "client" for the replica?
[08:38:14] <kkuno> so that every request of the frontends points to that server which choose to redirect to the proper member of the replica set?
[08:53:32] <arussel> isn't there a way to create an index on a secondary without restarting it as standalone ?
[10:36:06] <hnsr> hey all. I'm wanting to install mongodb on a CentOS server, but I'm not sure if I should install it from CentOS's repos (which appears to be at version 2.4.x), or from mongodb's yum repos
[10:37:12] <hnsr> I usually prefer to stick to installing from official distro repos, but in this case the version seems to be kinda behind what mongodb's repos are offering, so I'm not sure which to pick
[10:48:10] <KekSi> hnsr: i'd say always use the official mongo repo
[10:48:54] <KekSi> official distro repos are usually outdated and don't keep up with updates/fixes
[10:52:16] <KekSi> especially for things like mongodb or docker (for example apt provides docker 1.0.1 and its now at 1.6)
[10:52:43] <hnsr> I see, going to use mongo's repo then, thanks!
[10:53:34] <KekSi> i think mongo 2.4 is > 2 years old now and lacks *a lot* of features
[10:54:01] <hnsr> ouch, okay
[11:21:00] <ignasr> hi, I get IO spikes every hour, iotop show mongo doing it. Is this some kind of garbage collection, or flush? I can't find enything
[11:22:15] <ignasr> *anything
[12:13:09] <salty-horse> I have a large collection. I'd like to count all the elements matching a specific prefix: {_id: /^foo/}. It's slow. I wonder if it would perform faster if I had a separate field with just the prefix. does anyone have an idea if that could at all help?
[12:47:03] <deathanchor> salty-horse: indexing that field would help, but yes if you put the prefix in its own field and index that field it would be fastest if the regex is not required for the new field.
[12:47:36] <deathanchor> try a .explain() on the query to see why it could be slow
[12:49:47] <salty-horse> DA-away: I can't explain a count(), AFAIK. is it enough to simulate it with a find({}, {_id:1}) ?
[12:53:53] <deathanchor> salty-horse: yes you can change count to find and add the .explain()
[12:54:20] <deathanchor> a count runs the same query as a find.
[13:43:10] <salty-horse> deathanchor: stuck in find({}, {_id: 1}).explain() for 32 minutes
[14:05:00] <deathanchor> it will return eventually if your count took just as long
[14:05:04] <deathanchor> wait...
[14:05:11] <deathanchor> why are you doing an empty {} search?
[14:05:18] <deathanchor> is that how you are counting?
[14:06:16] <salty-horse> deathanchor: no, I wanted to test it like that first, so I'd have something to compare it to. when I cancelled it it finished 3/5 of the shards. will try again later
[14:06:43] <deathanchor> you should just test with the query that you know is slow
[14:07:18] <deathanchor> if you want an example output, just run a fast query for a specific indexed item.
[14:11:31] <salty-horse> deathanchor: I'm not sure I follow, but perhaps I wasn't clear. I'll just run explain on the prefix query and wait
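A rough sketch of the separate-prefix-field idea discussed above: store the prefix once at write time, index it, and count with a plain equality match instead of a regex. The helper name withPrefix, the field name "prefix", the sample _id, and the collection name "items" are all assumptions for illustration.

```javascript
// Hypothetical helper: copies the first `length` characters of _id into a
// separate field so it can be indexed and matched by plain equality.
function withPrefix(doc, length) {
  return Object.assign({}, doc, { prefix: String(doc._id).slice(0, length) });
}

const doc = withPrefix({ _id: "foo:12345" }, 3);
console.log(doc.prefix); // foo

// The slow form from the chat, and the equality form it could become:
const regexFilter = { _id: /^foo/ };
const prefixFilter = { prefix: "foo" };

// mongo shell usage (collection name "items" is an assumption):
//   db.items.createIndex({ prefix: 1 })
//   db.items.find(prefixFilter).explain()
//   db.items.count(prefixFilter)
```

Running explain() on the find form of each filter, as deathanchor suggests, is the way to confirm whether the new field actually helps on a given data set.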
[15:10:28] <jecran> Hello. New to mongo, using node.js. Can someone please tell me the proper way to CREATE a database through code? I found plenty of examples, on how to manipulate pre-made databases lol
[15:10:52] <cheeser> you don't need to create them, necessarily. just start writing to a collection.
[15:11:07] <StephenLynx> that
[15:11:09] <StephenLynx> and 2nd
[15:11:14] <StephenLynx> I suggest migrating to io.js
[15:11:21] <StephenLynx> it is everything node.js is but better, jecran
[15:12:06] <jecran> StephenLynx: node.js is my requirement lol.
[15:12:15] <StephenLynx> really?
[15:12:16] <StephenLynx> why?
[15:12:29] <jecran> cheeser: any naming conventions?
[15:12:37] <StephenLynx> because it is retro compatible.
[15:12:38] <jecran> StephenLynx: cause my boss says so :P
[15:12:52] <StephenLynx> tell your boss io.js is faster and more stable and all code works in it.
[15:13:06] <jecran> haha you tell him
[15:13:27] <StephenLynx> I told my boss he should ditch PHP for node, then migrated to io.js when it came out.
[15:13:30] <StephenLynx> I did my part.
[15:13:56] <StephenLynx> still fighting that war, almost won a battle.
[15:14:05] <jecran> Main job is still PHP. This is a side project to keep my hours flowing
[15:14:29] <StephenLynx> still, showing initiative is always a good thing.
[15:14:50] <StephenLynx> switching from node to io will have zero compromises, only advantages.
[15:15:22] <GothAlice> And learning curve, since there are differences or you wouldn't be recommending one over the other.
[15:15:32] <jecran> Are you a salesman? You should definitely pitch this
[15:15:34] <GothAlice> Practicality beats purity…
[15:15:35] <GothAlice> Yeah.
[15:15:52] <StephenLynx> I am a user. and there are no differences for developers.
[15:16:04] <StephenLynx> I just like to encourage adoption of better tech.
[15:16:13] <GothAlice> Mmmhmm.
[15:16:31] <StephenLynx> I migrated from node to io, all of my code worked flawlessly.
[15:16:49] <StephenLynx> it was a drop-in replacement, like mariadb for mysql.
[15:17:19] <GothAlice> … except MariaDB isn't a drop-in replacement if you want to use it properly.
[15:17:45] <StephenLynx> and really, showing initiative goes a long way in your job, jecran. even if you don't want to actually fight, at least proposing and showing you did your homework is good for you.
[15:18:35] <GothAlice> This isn't a language advocacy channel, AFAIK. :/ There's a difference between "I use X because it solves problem Y for me" (such as "I use MongoEngine on Python because of event/signal and cascade support") and "stop what you're doing and use X because it's awesome" while someone is in the middle of something.
[15:18:39] <cheeser> well, anytime you're using mysql, you've already given up on correctness
[15:18:40] <cheeser> :D
[15:18:42] <GothAlice> If I had an employee come to me with a technology switch mid-project I'd have to think long and no about that.
[15:18:46] <jecran> If I knew more about it I might lol...... I am still entry level and most of this stuff is relatively new to me (node and mongo)
[15:18:55] <cheeser> GothAlice++
[15:18:57] <StephenLynx> and I am not advocating for any language.
[15:19:22] <GothAlice> You are advocating one implementation of a language over another. You are advocating for a language.
[15:19:38] <StephenLynx> no because they both use V8, which is the same implementation.
[15:19:54] <StephenLynx> I am not advocating over any language.
[15:20:07] <StephenLynx> io.js could use lolcode and I wouldn't still touch the language subject.
[15:20:27] <StephenLynx> I really don't care about javascript, like I don't care about the color of my underwear.
[15:20:55] <jecran> You guys r funny
[15:21:41] <GothAlice> StephenLynx: You like any colour underwear as long as it is black. (I.e. bashing on about io.js over others, when everyone clearly has already got your point.) The signal to noise ratio today is somewhat off. :(
[15:21:49] <GothAlice> jecran: Sometimes we try to be. ;) Other times less so.
[15:21:54] <StephenLynx> I am not bashing node.
[15:22:02] <StephenLynx> I am promoting objectively better tech.
[15:22:06] <GothAlice> "bashing on about" / "going on about"
[15:22:14] <StephenLynx> because he argued back.
[15:22:25] <StephenLynx> if he just don't care, won't do, ok.
[15:22:27] <StephenLynx> I will stop.
[15:23:00] <StephenLynx> it won't keep me from promoting a technology that I know for a fact is superior for the next guy mentioning node.js.
[15:23:33] <cheeser> i love the commingling of "objectively" and "better"
[15:23:41] <StephenLynx> I got benchmarks.
[15:23:46] <jecran> So on a serious note, I have mongo installed into my node project using NPM. Should I install mongodb separate?
[15:23:49] <StephenLynx> I know the release history.
[15:24:02] <GothAlice> There's an efficiency question about your approach. Would it not be more efficient to join #node.js and advocate there, to an audience who predominantly are interested in such subject matter?
[15:24:10] <StephenLynx> you are referring to the module that is the driver, arent you?
[15:24:14] <StephenLynx> no, use npm.
[15:24:22] <StephenLynx> it will be easier to keep it updated and stuff.
[15:24:31] <GothAlice> jecran: MongoDB server components are separate from the client drivers.
[15:24:46] <GothAlice> jecran: So simply NPM installing a client driver won't get you a working database for local development.
[15:25:31] <StephenLynx> I won't do that for the same reason you don't think its reasonable for people to come here and tell us to stop using mongo, GothAlice.
[15:25:33] <jecran> GothAlice: the answer I was looking for :)
[15:38:21] <GothAlice> StephenLynx: Interestingly enough, I do find it quite reasonable for people to suggest databases other than MongoDB when discussing database design. MongoDB isn't the best for graphs, and isn't the best for anything requiring true relational integrity with ACID transactions (excluding the TokuMX fork). A real graph database or relational database may be a more valid solution to a given problem.
[15:38:27] <GothAlice> Practicality beats purity. :)
[15:38:51] <GothAlice> However, switching .io runtimes solves no problem. It's pure noise.
[15:38:52] <StephenLynx> you completely missed my point.
[15:39:02] <cheeser> neo4j is awesome, e.g.
[15:39:10] <GothAlice> cheeser: +9001 (that's over nine thousand)
[15:39:43] <StephenLynx> one thing is suggesting that mongo might not be the best solution for a problem. I suggested more than once for people here that they were better if they picked a different tool for their particular problem.
[15:39:58] <StephenLynx> what you suggest is that I would just sit here all day telling people to not use mongo.
[15:40:13] <StephenLynx> we are talking about oranges and apples.
[15:42:02] <StephenLynx> if we are here, on a channel dedicated to the tool, discussing if we make general use of said tool is not an option.
[15:42:50] <StephenLynx> if one don't think the tool this channel is dedicated tool is worth of any use at all, one can go away and use w/e he wants. that is the reason I don't go to the channel dedicated to node to tell people to not use node.
[15:43:05] <StephenLynx> if one don't think the tool this channel is dedicated to*
[15:43:35] <jecran> Where is the luv
[15:44:10] <cheeser> fergie has it.
[15:44:28] <GothAlice> I'm suggesting that recommendations "to use a better tool" (runtime) solves nothing, especially a MongoDB-related problem, and is noise for everyone in the channel not using or interested in JS. Noise that obfuscates or otherwise makes the MongoDB-related conversation more difficult to follow. JS performance isn't the problem, something to do with MongoDB is the problem; thus: wrong audience.
[15:45:03] <GothAlice> Fergie never lets go of the wuv. (This Earth concept of Wuv, with a Human double-u, confuses and infuriates us! /Futurama)
[15:46:04] <StephenLynx> if we were to be strict about what we discuss in this channel, you wouldn't get to recommend prices or tell people that using cloud is a bad idea, that doesn't help solve mongo issues either.
[15:46:17] <StephenLynx> recommend services based on prices*
[15:46:41] <cheeser> "i need to back up X gigs" "using blah it'd cost you $$$" problem solved.
[15:46:52] <StephenLynx> still not a mongo issue.
[15:47:43] <StephenLynx> "my runtime environment hasn't been updated since forever and is slow" "switching to this fork helps that" problem solved.
[15:48:09] <cheeser> 1. it kind of is. 2. i'm really not interested in a dick measuring contest so i'll just go back to my code.
[15:50:14] <jecran> cheeser: my new hero lol
[16:00:54] <jecran> So I am making some progress. I think. I have been trying for a few minutes now to display a list of all the databases, and the only output I am getting is 'local 0.078GB' not the names. I have tried 'show dbs', and 'db.getCollections()'. What am i doing wrong?
[16:02:39] <cheeser> you have one database called local
[16:04:10] <jecran> The default database it currently connects to when I start is 'test'. I think I created another db, it looks like I connect to it with the 'use' command
[16:06:06] <jecran> I have also used db.dropDatabase() on test and it is not going away haha
[16:54:12] <dunkel2> hello
[16:54:37] <dunkel2> if i want to protect my mongodb with iptables, should i set the real ip or 127.0.0.1 ?
[16:56:10] <StephenLynx> isn't binding enough?
[16:56:34] <dunkel2> is it?
[16:57:42] <StephenLynx> it is
[16:57:43] <StephenLynx> :v
[17:01:31] <GothAlice> dunkel2: http://docs.mongodb.org/manual/reference/configuration-options/#net.bindIp < MongoDB is configured to listen to specific network interfaces, by IP address. If your DB host has three interfaces (i.e. lo or loopback on 127.0.0.1, eth0 with a public address, and eth1 with an internal "management" address is typical for servers) you either need to explicitly tell MongoDB to listen for connections on these addresses, or specify 0.0.0.0 to indicate "all".
[17:01:42] <GothAlice> Only if you open MongoDB up (it defaults to loopback only) do you need to worry about firewall rules.
[17:02:25] <dunkel2> mmm maybe not, i just have a little app using mongodb
[17:02:47] <GothAlice> If your app is running on the same host as MongoDB (generally not a good idea, but possibly OK for a small enough app) then yeah, no need to open MongoDB up at all. :)
[17:03:06] <dunkel2> yes it is running on the same server and host
[17:03:16] <dunkel2> why is not a good idea ? :O
[17:03:27] <GothAlice> Because database services like MongoDB tend to want to eat all RAM they have access to.
[17:03:41] <GothAlice> This makes it generally unwise to share a server (or VM) between multiple services.
[17:03:42] <dunkel2> omg!
[17:04:18] <GothAlice> If all of your data fits in RAM, it becomes really fast. So databases want to get as much data into RAM as possible.
[17:04:20] <dunkel2> then for a big app i better get a server only for the mongodb service?
[17:04:30] <cheeser> wouldn't hurt
[17:04:54] <GothAlice> Yes. For anything larger than a "really tiny app" it's best to have three servers for your database, that way if something goes wrong and one dies, the application need not be interrupted.
[17:05:02] <dunkel2> and that way i will need to bind the mongo to the ip of the app server?
[17:05:31] <dunkel2> or having it that way i will need iprules?
[17:05:38] <GothAlice> Correct. Likely you can get away with binding to the "internal service network" address, and not the public one. That way you only really need to worry about firewall locking down to the servers you care about, but there's near-zero risk of public access while you're setting it up.
[17:05:40] <dunkel2> iptables rules
[17:09:10] <dunkel2> ok thanks GothAlice ill have that in mind I hope by now my small app doesnt give me any problems in the server
[17:09:18] <GothAlice> It shouldn't. :)
[17:09:44] <GothAlice> (Real applications requesting memory take priority over the memory mapped file caches MongoDB uses.)
[17:09:57] <StephenLynx> it isn't possible to apply different updates to different documents in a single update operation, is it?
[17:10:40] <GothAlice> StephenLynx: No; technically certain operations like $inc often result in different final values in the resulting documents, but the individual operations are uniformly applied.
[17:24:08] <dunkel2> GothAlice: can you help me? im trying to set the auth on my mongodb installation but i can’t get it working, i still can connect using the mongo command and it let me use dbs
[17:24:34] <dunkel2> it just does not allow me to show dbs or collections, but i think i can use and maybe write?
[17:24:45] <GothAlice> dunkel2: Have you enabled --auth in the command line or configuration file?
[17:25:02] <dunkel2> i did it in the config
[17:25:03] <GothAlice> dunkel2: And have you created your first ("root") administrative user?
[17:25:04] <dunkel2> auth=true
[17:25:05] <pamp> you cant
[17:25:35] <dunkel2> i run mongo, then use the admin db and create my user
[17:26:02] <dunkel2> like this http://docs.mongodb.org/manual/tutorial/add-user-administrator/#create-the-system-user-administrator
[17:26:19] <GothAlice> Perfect.
[17:26:33] <GothAlice> After that point, you'll need to use that user to perform other actions, such as adding database-specific users.
[17:26:58] <GothAlice> Even with authentication enabled, you'll be able to connect. And seemingly select databases. But without authenticating you won't be able to do anything "real" with that connection.
[17:29:25] <dunkel2> cool i just made a test
[17:29:38] <dunkel2> now… sorry for the newb question but how do i authenticate?
[17:31:13] <GothAlice> dunkel2: On the command line tools, using -u and -p for user and password. (If you specify -p as the final option, it will prompt you to enter the password so as to not pollute your shell history with that confidential information.)
[17:31:29] <GothAlice> dunkel2: With any command-line tool, run it with "--help" to get a list of all options and possibly some examples.
[17:31:35] <dunkel2> and the db.auth?
[17:32:22] <GothAlice> That's a shell command to authenticate after connection. (Command line options do this for you.)
[17:32:33] <GothAlice> At the application level, you'll have to examine the documentation for the MongoDB client driver you use.
[17:32:48] <dunkel2> ok, i use mongoose let me check that
[17:32:54] <dunkel2> thanks again GothAlice
[17:33:05] <GothAlice> mongo --port 27017 -u someUser --authenticationDatabase admin -p
[17:33:16] <GothAlice> That'd be a quick test, given an appropriate "someUser" name.
[17:34:53] <GothAlice> For further reading, please consider: http://docs.mongodb.org/manual/tutorial/add-user-to-database/
[17:34:54] <GothAlice> dunkel2: ^
[17:34:58] <GothAlice> :)
[17:35:16] <eprime> hi, i'm trying to use $dayOfYear but am getting the following message: Can't handle date values outside of time_t range, code 16421... all my date fields contain an ISODate, anyone knows what goes wrong there? my group looks like this: { $group: { _id: { $dayOfYear: "$date"}}}
[17:35:21] <dunkel2> yeah i just did :D but with the db.auth
[17:35:33] <dunkel2> i am trying the line you gave me
[17:35:41] <dunkel2> and i get Error: 18 Authentication failed.
[17:35:52] <GothAlice> dunkel2: Make sure you're using the right username, there. ;)
[17:36:18] <dunkel2> yeah! i had a typo
[17:36:21] <dunkel2> thanks!
[17:37:02] <GothAlice> eprime: Double-check that your dates are, in fact, all ISODate instances. UNIX timestamps would explode. That error is notably stating that your provided date value is outside the valid range for dates. (I.e. way too far in the future, or earlier than the epoch which I believe is Jan 1st 1970.)
[17:39:05] <GothAlice> Ah, sorry, the time_t range is December 13th 1901 to January 19th 2038 for 32-bit time_t structures. 64-bit would extend quite a bit beyond that end date. (I.e. 20x the present age of the universe).
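The 32-bit time_t bounds GothAlice mentions can be checked with a few lines of JavaScript. This is only a sketch of the arithmetic; the helper name fitsTimeT32 is hypothetical and is not how the server performs its own check.

```javascript
// Bounds of a signed 32-bit time_t, in seconds since 1970-01-01T00:00:00Z.
const TIME_T_MIN = -(2 ** 31);
const TIME_T_MAX = 2 ** 31 - 1;

// JavaScript dates count milliseconds, so multiply by 1000.
console.log(new Date(TIME_T_MIN * 1000).toISOString()); // 1901-12-13T20:45:52.000Z
console.log(new Date(TIME_T_MAX * 1000).toISOString()); // 2038-01-19T03:14:07.000Z

// Hypothetical helper: true when a date fits in a signed 32-bit time_t.
function fitsTimeT32(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds >= TIME_T_MIN && seconds <= TIME_T_MAX;
}

console.log(fitsTimeT32(new Date("2015-05-04T00:00:00Z"))); // true
console.log(fitsTimeT32(new Date("2040-01-01T00:00:00Z"))); // false
```

A check like this could be run over the stored dates to find the documents that trigger error 16421.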
[17:39:51] <pjammer> Does replica sets help with slow writes by allowing reads on the primary, at all? Or is replica sets strictly for "primary is down for 10s, get a new one by electing from the secondary pool"
[17:40:51] <eprime> GothAlice, thanks! maybe it's the 2038, let me check
[17:41:15] <GothAlice> pjammer: joannac has an excellent article I don't have a link handy for that covers what types of performance improvements different strategies (replication, sharding) can offer. All writes MUST go to a primary, and there's only one primary in a bare replica set, so you get no particular gain in write performance.
[17:41:48] <pjammer> thanks for the name and will google for that link.
[17:41:58] <pjammer> and the info too lol
[17:42:27] <GothAlice> (And those writes are replicated to the secondaries, too, meaning write load is amplified by the size of the replica set; reads on a secondary will still need to wait for write locks.)
[17:55:59] <eprime> GothAlice, all good, thank you very much indeed :)
[17:56:11] <GothAlice> It never hurts to help. :)
[18:00:05] <dunkel2> GothAlice: hello again :) im having an authentication problem
[18:00:13] <dunkel2> i hope you can help me again
[18:00:30] <dunkel2> i am authenticated to the db but i still cant show collections
[18:01:02] <dunkel2> not authorized on publifi to execute command { listCollections: 1.0 }"
[18:02:39] <GothAlice> What permissions did you give the user you are attempting to use?
[18:03:33] <dunkel2> role" : "userAdmin",
[18:03:43] <GothAlice> userAdmin lets that user create and manage users, nothing more. ;)
[18:03:52] <dunkel2> :O!!!
[18:03:56] <dunkel2> then thats why
[18:04:06] <GothAlice> Ref: http://docs.mongodb.org/manual/reference/built-in-roles/
[18:04:37] <GothAlice> (At least you should be able to use that user to give itself more permissions. Ha! ;)
[18:04:48] <dunkel2> xD
[18:04:59] <dunkel2> yeah i think readwrite is enough
[18:06:04] <GothAlice> I have several users set up in my main cluster at work: amcgregor (me) @admin for general admin access. Then app@staging and app@production with readWrite on the respective DB.
[18:06:36] <GothAlice> (Each app user has the same name, but different password for each database.)
[18:08:55] <dunkel2> and it is on! :D thank you so much
[18:10:07] <GothAlice> No worries. :)
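A minimal sketch of the per-database application user described above, shaped as the document that the mongo shell's db.createUser() expects. The database name publifi comes from the error message in the chat; the helper name appUser, the user name "app", and the placeholder password are assumptions.

```javascript
// Hypothetical helper: builds the document passed to db.createUser().
function appUser(name, password, dbName) {
  return {
    user: name,
    pwd: password,
    // readWrite allows reading and writing data in one database;
    // userAdmin alone only allows managing users.
    roles: [{ role: "readWrite", db: dbName }]
  };
}

const user = appUser("app", "change-me", "publifi");
console.log(JSON.stringify(user, null, 2));

// mongo shell, after authenticating as a user administrator:
//   use publifi
//   db.createUser(user)
// An existing user could instead be given the role with:
//   db.grantRolesToUser("app", [{ role: "readWrite", db: "publifi" }])
```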
[18:19:24] <GothAlice> BTW, Derick and the MongoDB folks, SSL authentication with a PIV smart card token is _utterly_ boss.
[18:22:32] <GothAlice> Over the last week I've been integrating https://www.yubico.com/products/yubikey-hardware/yubikey-neo/ into my workflow; so far, so good! (Excluding a bit of a hiccup relating to a CVE against the OpenPGP app on the smart card… that lets the token sign anything at all without needing PIN entry…)
[18:45:40] <Thinh> Hey GothAlice, do you know any good mocking packages for pymongo?
[18:46:10] <GothAlice> Thinh: Depends on your requirements; real mocking, or will fixtures suffice?
[18:46:38] <Thinh> I'd prefer something like mongomock
[18:46:46] <cheeser> mocks--
[18:46:51] <Thinh> https://github.com/vmalloc/mongomock
[18:47:33] <GothAlice> Indeed, Mongomock is a thing. The difficulty: you're not really testing against MongoDB, thus any guarantees that it can provide are scoped to "your code works with Mongomock! Congratulations!" without necessarily implying anything further.
[18:47:44] <Thinh> haha
[18:48:00] <cheeser> one of several reasons mocks are a bad choice.
[18:48:03] <GothAlice> I always test against a real MongoDB version.
[18:48:58] <Thinh> cool, thanks for your input
[18:49:07] <GothAlice> https://gist.github.com/amcgregor/c33da0d76350f7018875 < is the script that gets spun up prior to running my tests, and shut down after. It runs a full 2x3 sharded replica set on a single host, on a non-standard port (for the mongos router especially) so as to not conflict with any solo development service you may have running.
[18:49:32] <GothAlice> Clean-up is as simple as rm -rf $BASE/*
[18:50:13] <Thinh> That's actually quite nice, thanks
[18:50:21] <GothAlice> :)
[18:50:43] <GothAlice> 'Cause if you use sharding, it's important to test how it balances and how query performance may be affected by alternate shard index strategies. :)
[18:50:59] <GothAlice> (Testing sharding strategies was the original reason I wrote that script.)
[18:51:03] <Thinh> Good point
[18:51:21] <Thinh> I'm not using sharding yet but it'll definitely come into play soon
[19:09:42] <tpayne> how do i backup my collections?
[19:10:04] <tpayne> unless it's done for me automatically, i'm on EC2/AWS
[19:11:18] <GothAlice> tpayne: https://gist.github.com/amcgregor/4fb7052ce3166e2612ab#backups
[19:12:36] <tpayne> GothAlice: thanks!
[19:40:18] <iprime> GothAlice, all my documents in a collection contain just these two fields: UUID, date. Do you think that using the aggregation framework there would be a way to collect all UUIDs for a certain date, where there isn't a document with the same UUID for an earlier date?
[19:42:52] <GothAlice> iprime: Hmm. Not that I can think of right off the bat.
[19:43:09] <StephenLynx> where there isn't a document with the same uuid for an earlier date?
[19:43:32] <GothAlice> Sounds like you're going to need to query for all UUIDs in that previous time, then explicitly exclude those in a second query for the future timeframe.
[19:43:32] <StephenLynx> that is where it looks to complicates. you could get all UUIDS for a certain date though.
[19:43:52] <StephenLynx> to complicate*
[19:44:08] <StephenLynx> it seems* , damn, did I messed up there.
[19:44:17] <StephenLynx> mess*
[19:44:18] <StephenLynx> fug
[19:45:06] <deathanchor> why not just search for the UUIDs of the date you want, then just look up the UUID for older times later (assuming you indexed by UUID).
[19:45:48] <StephenLynx> you could first group by date, then group by UUID though.
[19:46:02] <GothAlice> deathanchor: Pre-calculating exclusions, rather than confirming inclusions after the fact, are two sides of the same coin. Which is optimum will depend on the number of values matched by each. (I.e. are there fewer values to include, or exclude?)
[19:46:05] <StephenLynx> wait, no
[19:46:33] <StephenLynx> iprime, could you give an example of input / output?
[19:49:51] <deathanchor> GothAlice: agreed, but most times #oldItems > #itemsWanted
[19:50:26] <deathanchor> especially when dealing with time over time.
[19:59:44] <iprime> StephenLynx, http://hastebin.com/alirodehap.sm
[20:00:16] <StephenLynx> and the desired output?
[20:00:39] <iprime> in the example what I want is that the aggregation only returns 1 result: "asdf1" and not return "asdf"
[20:01:32] <iprime> i could group by uuid and use $addToSet to collect all dates for each id, but i wonder if that could be of any use
[20:02:20] <iprime> maybe in another pass i could project/match $size of dates == 1 and date is today
[20:02:30] <iprime> not sure if this can be done using aggregation alone
[20:04:50] <Derick> GothAlice: you use it to auth with MongoDB?
[20:06:13] <GothAlice> Derick: That's what I'll be hacking on tonight. So far PIV is working for the CA I use, SSH, and a bunch of other services. Here's hoping the mongod binaries dynamically link against OpenSSL. ;)
[20:09:14] <Derick> GothAlice: I doubt it
[20:10:16] <GothAlice> Well, the hope shifts then to it being a new enough statically linked OpenSSL to include the needed driver for my token.
[20:15:33] <GothAlice> Also I need to figure out the virtual path for this device. :/ MongoDB won't ever have direct access to the "keyfile".
[20:46:45] <iprime> StephenLynx, GothAlice, I think I've hacked it: http://hastebin.com/edihelohis.sm
[20:47:10] <iprime> still it really feels like a hack
[20:47:59] <iprime> and this only works because i want to match against 'today', i imagine i'd have to calc the 'max date' if this has to work for any given date
[20:51:47] <iprime> hmm, actually no need for maxdate check
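One possible way to express "UUIDs first seen on a given date" in the aggregation framework is to group by UUID, keep the earliest date with $min, and then match on that earliest date. This is a sketch under assumptions and not necessarily what iprime's paste contains: the field names uuid and date, the collection name "visits", the helper names, and the sample data are all made up for illustration. A plain JavaScript equivalent is included so the logic can be checked without a server.

```javascript
// Hypothetical helper: builds a pipeline that keeps only the UUIDs whose
// earliest document falls inside [dayStart, dayEnd).
function firstSeenPipeline(dayStart, dayEnd) {
  return [
    { $group: { _id: "$uuid", firstSeen: { $min: "$date" } } },
    { $match: { firstSeen: { $gte: dayStart, $lt: dayEnd } } }
  ];
}

// Plain-JavaScript equivalent of the pipeline, for checking the logic.
function firstSeenInMemory(docs, dayStart, dayEnd) {
  const firstSeen = new Map();
  for (const { uuid, date } of docs) {
    if (!firstSeen.has(uuid) || date < firstSeen.get(uuid)) {
      firstSeen.set(uuid, date); // remember the earliest date per UUID
    }
  }
  return [...firstSeen]
    .filter(([, date]) => date >= dayStart && date < dayEnd)
    .map(([uuid]) => uuid);
}

// Made-up sample data, modeled on the "asdf" / "asdf1" example in the chat.
const docs = [
  { uuid: "asdf", date: new Date("2015-05-03T10:00:00Z") },
  { uuid: "asdf", date: new Date("2015-05-04T09:00:00Z") },
  { uuid: "asdf1", date: new Date("2015-05-04T12:00:00Z") }
];
const dayStart = new Date("2015-05-04T00:00:00Z");
const dayEnd = new Date("2015-05-05T00:00:00Z");

console.log(firstSeenInMemory(docs, dayStart, dayEnd)); // [ 'asdf1' ]

// mongo shell usage (collection name "visits" is an assumption):
//   db.visits.aggregate(firstSeenPipeline(dayStart, dayEnd))
```

Because the match runs on the earliest date per UUID, the same pipeline works for any target day, not only today.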
[23:11:15] <pjammer> Two part question: $ne queries can't be indexed, correct? and does .count() use an index if one is available?
[23:12:09] <pjammer> does no index also apply to $nor as well?
[23:15:32] <GothAlice> pjammer: .count() uses "an index", but not in the way one might expect. This has repercussions involving inaccuracy on sharded clusters.
[23:16:07] <pjammer> thanks.
[23:16:34] <GothAlice> I'm currently digging around for an answer to your other question. :)
[23:17:59] <pjammer> thanks. i've been asked to shard, but I have to believe if we add a just functioning standalone DB, sharded, I fear we're just having 3 replica sets of the same crud.
[23:18:30] <pjammer> from all the vids and talks i've seen, best to exhaust indexes and such.
[23:19:21] <pjammer> all the hardware stuff we're doing. mind you there is a lot about noatime but not sure i heard about ubuntu's reltime (may have mistyped that name).
[23:19:28] <GothAlice> Ha, darn you MongoDB for optimizing my simplified test case away! Stop using "IDHACK" as your query plan! XD
[23:20:04] <GothAlice> What type of value are you $ne'ing?
[23:20:06] <GothAlice> Integer, string?
[23:21:05] <pjammer> column is called hidden, just checking what it is in there.
[23:21:22] <GothAlice> pjammer: From what I can see, $ne does use an index. (An index scan, specifically.)
[23:22:21] <pjammer> I saw this on the mtools post on the mongodb.org blog: "$ne queries are known to be inefficient since these queries cannot use an index, resulting in a high number of documents scanned."
[23:22:29] <pjammer> http://blog.mongodb.org/post/85123256973/introducing-mtools
[23:22:55] <GothAlice> $not with a sub-expression, however, avoids indexes completely.
[23:24:03] <GothAlice> Yeah, it's scanning the index. Slightly better than a document scan, but not much.
[23:24:39] <GothAlice> (In my case I was testing for == and != an ObjectId. Other field types may behave differently when indexed.)
[23:24:55] <pjammer> So when you guys come across code bits like this, do you rework to get rid of the $ne
[23:25:05] <GothAlice> If possible, yes.
[23:25:09] <GothAlice> It's not always possible. ;)
[23:25:21] <pjammer> But typical refactor protocol
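One caveat when reworking a `$ne` into a positive match: `{hidden: {$ne: true}}` also matches documents where the field is missing, while `{hidden: false}` does not. The sketch below models only that simple scalar case in plain Python (no server, no array semantics). It assumes `hidden` is a boolean flag, which the log does not confirm; the sample documents and helper names are hypothetical.

```python
# Hypothetical sample documents; `hidden` is assumed to be a boolean flag.
docs = [
    {"_id": 1, "hidden": True},
    {"_id": 2, "hidden": False},
    {"_id": 3},  # field absent
]

_MISSING = object()  # sentinel standing in for "field not present"


def matches_ne(doc, field, value):
    """Model of {field: {"$ne": value}} for scalar fields: missing fields match."""
    return doc.get(field, _MISSING) != value


def matches_eq(doc, field, value):
    """Model of {field: value} for scalar fields: missing fields do not match."""
    return doc.get(field, _MISSING) == value


ne_ids = [d["_id"] for d in docs if matches_ne(d, "hidden", True)]
eq_ids = [d["_id"] for d in docs if matches_eq(d, "hidden", False)]
print(ne_ids, eq_ids)
```

If some documents lack the field, the two filters return different results, so the rewrite is only safe once the field is populated everywhere (or the missing case is handled explicitly).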
[23:25:32] <Zelest> *yawns*
[23:41:32] <Streemo> Hi, I was reading in the docs: "The circle's radius, as measured in the units used by the coordinate system."
[23:41:43] <Streemo> then they use an example with r = 10
[23:41:46] <Streemo> 10 what?
[23:41:51] <Streemo> How does one specify the units
[23:42:17] <Streemo> 1 unit of lng?
[23:42:26] <GothAlice> One unit in whatever coordinate system you are using.
[23:42:37] <GothAlice> "A vector with length one."
[23:42:48] <Streemo> right, so it'd be in units of lng?
[23:42:55] <GothAlice> What's "lng"?
[23:42:58] <Streemo> longitude
[23:43:09] <Streemo> my pair is : lng, lat
[23:43:44] <GothAlice> That doesn't make sense. lat/lng is a coordinate, but what are they _measuring_, that is the question. (The answer boils down to radians.)
[23:44:01] <Streemo> I'm trying to find all docs within a circle of radius 5 miles
[23:44:01] <GothAlice> I'm assuming your coordinate system is on the surface of a sphere.
[23:44:11] <GothAlice> AH!
[23:45:08] <Streemo> and yeah radians, but i can convert that to distance
[23:45:20] <Streemo> the question is, does mongo make me do the conversion myself or is there an easier way
[23:45:28] <GothAlice> Distance is in meters if using GeoJSON points and $minDistance/$maxDistance.
[23:45:41] <Streemo> see thats perfect, GeoJSON?
[23:45:59] <Streemo> $geometry?
[23:46:14] <GothAlice> Yes; that is the preferred method for geographic data storage; it gives the widest range of capability. http://docs.mongodb.org/manual/reference/glossary/#term-geojson
[23:46:19] <GothAlice> Yes to the former, not the latter.
[23:47:11] <Streemo> "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
[23:48:00] <Streemo> each document has a single location on the earth's surface
[23:48:05] <Streemo> so i'd use point
[23:48:08] <GothAlice> { "type": "Point", "coordinates": [100.0, 0.0] }
[23:48:39] <GothAlice> http://docs.mongodb.org/manual/tutorial/build-a-2dsphere-index/#procedure demonstrates inserting two points.
[23:49:41] <Streemo> Thanks Alice, helpful as usual :)
[23:50:40] <GothAlice> http://docs.mongodb.org/manual/tutorial/query-a-2dsphere-index/#proximity-to-a-geojson-point demonstrates how to query them the way you are wanting. (Distance in meters.) Or, in radians if you *really* want to go that route: http://docs.mongodb.org/manual/tutorial/query-a-2dsphere-index/#points-within-a-circle-defined-on-a-sphere (if you convert miles to radians by doing the division, just use meters ;)
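The proximity query described above can be sketched as a plain filter document. This is a minimal sketch, assuming the GeoJSON point is stored in a field named `loc` with a 2dsphere index on it; the field name and the helper `near_query` are hypothetical. `$maxDistance` is given in meters, so the only conversion is miles to meters.

```python
METERS_PER_MILE = 1609.344  # international mile, exact by definition


def near_query(lng, lat, miles, field="loc"):
    """Build a $near filter around a GeoJSON point.

    `field` is a hypothetical field name; it would need a 2dsphere index.
    GeoJSON coordinates are ordered [longitude, latitude].
    """
    return {
        field: {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lng, lat]},
                "$maxDistance": miles * METERS_PER_MILE,  # meters
            }
        }
    }


# Everything within 5 miles of the point used earlier in the conversation.
query = near_query(102.0, 0.5, 5)
print(query)
```

With a driver such as pymongo, a dict like this would be passed as the filter to `find()`; results from `$near` come back sorted nearest first.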
[23:52:19] <Streemo> yeah i have no desire to deal with arc lengths of great circles of the earth
[23:52:25] <GothAlice> Yuuuuup.
[23:52:48] <Streemo> :x
[23:53:19] <Streemo> using near autosorts them.
[23:53:25] <Streemo> i think i'll do within instead
[23:53:25] <GothAlice> … yes.
[23:53:28] <GothAlice> ^_^
[23:53:32] <GothAlice> I enjoy non-Euclidean spaces.
[23:54:22] <StephenLynx> I once considered using source engine only so I could make non-euclidean levels for a game
[23:54:54] <StephenLynx> because UDK portals were not very natural, they had a very clear transition.
[23:55:02] <GothAlice> StephenLynx: Duke Nukem 3D made it quite easy, back in the day. Realtime preview and everything, too.
[23:55:27] <StephenLynx> yeah, id tech2 had this workaround for that.
[23:55:28] <Streemo> well sometimes non euclidean space makes things easier
[23:55:35] <StephenLynx> but by no means it was realistic.
[23:55:58] <StephenLynx> with source engine you literally can't tell the difference when you cross a portal
[23:56:20] <Streemo> for example, if your level was for the death star
[23:56:33] <StephenLynx> they work so well, valve incorporated its usage in basic level construction
[23:56:39] <StephenLynx> for portal
[23:58:41] <Streemo> GothAlice: is there a similar modern syntax for geoWithin which has a similar usage as "$near"?
[23:59:24] <Streemo> apparently it was deprecated?
[23:59:55] <GothAlice> Alas, radians are where it's at now.
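For the unsorted "within a circle" variant, the radians-based form is `$geoWithin` with `$centerSphere`. A minimal sketch, again assuming a hypothetical `loc` field: the radius in radians is the distance divided by the Earth's radius in the same unit (about 3963.2 miles, an approximation).

```python
EARTH_RADIUS_MILES = 3963.2  # approximate equatorial radius of the Earth


def within_circle_query(lng, lat, miles, field="loc"):
    """Build a $geoWithin/$centerSphere filter.

    `field` is a hypothetical field name. The radius must be in radians,
    so the distance in miles is divided by the Earth's radius in miles.
    """
    radius_radians = miles / EARTH_RADIUS_MILES
    return {
        field: {
            "$geoWithin": {
                "$centerSphere": [[lng, lat], radius_radians],
            }
        }
    }


# Everything within 5 miles of [102.0, 0.5], in no particular order.
query = within_circle_query(102.0, 0.5, 5)
print(query)
```

Unlike `$near`, `$geoWithin` does not sort results by distance.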