PMXBOT Log file Viewer


#mongodb logs for Friday the 16th of September, 2016

[01:02:05] <windsurf_> Is it possible to use $group to come up with a query that will give me a $sum of how many of each value a particular field has?
[01:02:51] <windsurf_> my documents have a 'status' property and the value can be one of 12 values. I want a resulting object with a count for each status value
[01:08:31] <joannac> windsurf_: yes?
[01:08:50] <windsurf_> worth pursuing then, thanks
[01:09:27] <joannac> windsurf_: https://docs.mongodb.com/manual/reference/operator/aggregation/group/#group-by-month-day-and-year <--- that example might be useful
[01:12:55] <windsurf_> thank you @joannac
[01:20:06] <windsurf_> joannac I'm having trouble with the syntax. If I just keep the _id line and have a count:{$sum:1} then I get how many parts there are per jobID but what I really want is a count of how many of each status value. This code isn't working but am I getting close? http://pastebin.com/4fhpr76C
[01:22:09] <joannac> if you don't want to group on jobId, then don't group on jobID
[01:22:12] <joannac> group on status
[01:22:28] <windsurf_> hm ok
[01:22:35] <windsurf_> i kind of need to do it on both
[01:22:40] <windsurf_> but i'll try both
[01:33:38] <windsurf_> hm.
[01:33:47] <windsurf_> that's getting close to what I need
[01:33:52] <windsurf_> was hoping to have multiple counts per document
[01:36:10] <windsurf_> @joannac I could work with this http://pastebin.com/GLnEne86 but I have a feeling I can get Mongo to do more via the query
[01:36:42] <windsurf_> I do need to ultimately know for each job, how many Part documents have status AAA, how many have BBB...
[01:36:56] <windsurf_> would be nice if each job were one document with those Part totals
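A minimal sketch of joannac's suggestion above: grouping on both fields gives one count per (jobID, status) pair. The field names `jobID` and `status` come from the conversation; the collection name `parts` and the shell invocation are assumptions.

```javascript
// Assumption: a 'parts' collection whose documents carry jobID and status.
// Emits one document per (jobID, status) pair with its count.
db.parts.aggregate([
  { $group: {
      _id: { jobID: "$jobID", status: "$status" },
      count: { $sum: 1 }
  } }
])
```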
[02:11:55] <bros> I have a replset with 2 members right now.
[02:11:58] <bros> If I bring the primary down
[02:12:03] <bros> my app doesn't switch to the secondary
[02:12:12] <bros> Why? Is it my connection string?
[02:12:38] <cheeser> do you list more than one host in your connection string?
[02:12:42] <bros> Yes.
[02:13:05] <cheeser> do you get an error trying to write?
[02:14:53] <Boomtime> ->"I have a replset with 2 members right now" -- without either one of these, there is no primary
[02:14:59] <bros> I get timeout errors.
[02:15:44] <bros> I actually can't bring down either one
[02:15:48] <bros> without my entire app crashing...
[02:15:55] <cheeser> did retry your update? you might be hitting the election cycle
[02:16:54] <bros> All I know is if I kill either one of the nodes, my entire app crashes...
[02:17:00] <bros> Isn't that the exact opposite of what is supposed to happen?
[02:17:46] <cheeser> your app shouldn't crash, no, but that's on you to detect the write failure and handle accordingly.
[02:17:55] <cheeser> which language?
[02:18:29] <bros> So, the queries should time out?
[02:18:31] <bros> here is what I don't get
[02:18:34] <bros> I just brought my secondary down
[02:18:38] <bros> and now my app times out...
[02:18:47] <bros> Why?
[02:19:22] <cheeser> the driver knows what the primary was and it tries to write to that primary. sockets take time to time out via the OS signals.
[02:19:28] <cheeser> what language are you using?
[02:20:57] <bros> node.js
[02:21:08] <cheeser> ah. not sure how that driver handles such things.
[02:21:13] <bros> I have no retry code because I typically don't see this, at all.
[02:21:17] <bros> Database timing out is code red to me.
[02:21:52] <cheeser> thread-capable drivers have threads monitoring the cluster and can find the new primary
[02:22:15] <cheeser> i think drivers like the node and php drivers rely on timeouts or write errors to find out there's a new primary
[02:38:30] <bros> cheeser: I thought the whole "secondary going down" would be more seamless
[02:38:34] <bros> What's the default timeout? 30s?
[03:22:31] <cheeser> a secondary going down isn't usually a problem
[04:01:03] <bros> no valid replicaset members found
[04:01:05] <bros> cheeser:
[04:01:51] <joannac> is that after you take one node down?
[04:01:55] <joannac> and so there's no primary?
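A sketch of the connection setup being discussed, for the 2016-era node.js driver bros says he is using. Hosts, port, set name, and database name are placeholders; the key points are listing every member and naming the replica set so the driver performs discovery. Note Boomtime's caveat still applies: with only two voting members, taking either one down leaves no majority, so no primary can be elected no matter what the client does; a third member or an arbiter is needed for failover to work.

```javascript
// Hypothetical hosts and set name -- illustration only.
var MongoClient = require('mongodb').MongoClient;

var uri = 'mongodb://host1:27017,host2:27017/mydb?replicaSet=rs0';

MongoClient.connect(uri, function (err, db) {
  if (err) throw err;
  // ... use db; writes still fail while no primary exists ...
});
```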
[07:00:57] <gokl> Hi, regarding _id as default ObjectId() field, is it guaranteed unique across a sharded collection? Or is it only unique in combination with the shard key? Is there something in the docs about it? I couldn't find it.
[07:03:23] <joannac> gokl: _id is not guaranteed unique across a sharded collection
[09:22:36] <amitprakash> How do I get a cursor via listIndexes ?
[09:23:02] <amitprakash> I tried db.collection.getIndexes({listIndexes: collection_name}).cursor, but that doesn't seem to work
[09:23:10] <amitprakash> since the output for getIndexes is an array
[09:24:14] <Derick> which language and driver?
[09:24:48] <Zelest> swedish... reverse...
[09:24:49] <Zelest> :D
[09:25:14] <amitprakash> Derick, mongo shell?
[09:25:40] <Derick> amitprakash: not sure whether you can in the shell - what do you need a cursor for? Isn't the array enough?
[09:27:32] <amitprakash> Derick, to filter on returned indexes by params
[09:27:58] <Derick> the listIndexes command has a filter option for that
[09:28:16] <amitprakash> Derick, right, how do I do this?
[09:28:25] <amitprakash> Can't seem to find the command listIndexes in mongo shell
[09:28:35] <amitprakash> and don't see a way to pass filters to getIndexes
[09:28:49] <Derick> one sec
[09:30:48] <Derick> seems I was mistaken - there is no filter for listIndexes, only for listCollections:
[09:30:54] <Derick> https://docs.mongodb.com/manual/reference/command/listCollections/
[09:31:06] <Derick> amitprakash: also, what you've done is not the way to run the listIndexes command
[09:31:26] <amitprakash> Derick, so I can't filter indexes at all?
[09:31:58] <Derick> you'd do that with: db.runCommand( { listIndexes: 'colname' } );
[09:32:37] <Derick> the .cursor you get back from that is not equivalent to a normal cursor though - because it's already executed on the server, you can't do any (server side) filtering on it. It is something you're going to have to do in your client/application.
[09:32:47] <amitprakash> Yay
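Putting Derick's two points together: run the command directly, then filter the returned batch client-side, since listIndexes itself takes no filter. `'colname'` and the predicate (selecting hashed indexes) are placeholders for illustration.

```javascript
// Run the listIndexes command; index documents come back in cursor.firstBatch.
var res = db.runCommand({ listIndexes: 'colname' });

// Client-side filtering, e.g. keep only hashed indexes (placeholder predicate):
var hashedOnly = res.cursor.firstBatch.filter(function (ix) {
  return JSON.stringify(ix.key).indexOf('hashed') !== -1;
});
```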
[09:48:43] <gokl> joannac: Thank you.
[09:49:32] <gokl> Is it possible to get the hash value of a hashed index? I have a hashed index on _id (ObjectId) and would like to see the hash for a given document.
[10:22:07] <r0j4z0> hi there!
[10:22:19] <r0j4z0> one quick question im struggling with
[10:23:10] <r0j4z0> im running mongo --quiet --host ds2_silver/172.20.234.70:29017,172.20.234.79:29017,172.20.234.80:29017 keyon --eval="db.Consumers.find()"
[10:23:26] <r0j4z0> but im still getting NETWORK messages
[10:23:43] <r0j4z0> 2016-09-16T12:10:58.791+0200 I NETWORK starting new replica set monitor for replica set ds2_silver with seeds 172.20.234.70:29017,172.20.234.79:29017,172.20.234.80:29017
[10:23:43] <r0j4z0> 2016-09-16T12:10:58.792+0200 I NETWORK [ReplicaSetMonitorWatcher] starting
[10:23:55] <r0j4z0> is there any way to disable those messages too?
[12:16:36] <Zelest> :(
[12:20:04] <LowWalker> So I'm doing some automation to bring up a 2 slave 1 master cluster, I'm still not getting good responses when trying rs.conf() or rs.status()
[12:20:23] <LowWalker> Here's what I am seeing for those who might be bored enough to help :) - https://gist.github.com/lowwalker/ee8bd52a5b75aeb0373f0372ab73de46
[12:22:07] <joannac> what's the response when you run the rs.initiate(.....) command?
[12:24:08] <cheeser> replicaSetPrimar: please fix your connection. you've been bouncing in and out for over a day now.
[12:25:02] <joannac> ohai cheeser
[12:25:43] <cheeser> what up, yo?
[12:28:51] <Zelest> wazzaaa!
[12:30:45] <mpajor> how would I go about creating an entry in a collection with a uuid? e.g. { "userId": ObjectId() } ?
[12:31:47] <LowWalker> @joannac, gist updated. Added output when I try to manually execute the script
[12:33:52] <joannac> LowWalker: instead of running the script, can you just open a mongo shell, paste the rs.initiate() command?
[12:34:46] <joannac> looks like both those errors are from the rs.add()
[12:38:26] <LowWalker> @joannac, "errmsg" : "'192.168.150.52:27017' has data already, cannot initiate set.",
[12:38:40] <LowWalker> that's when I execute just the rs.initiate(...) block
[12:38:43] <joannac> there you go, that's why the initiate is failing
[12:39:05] <LowWalker> Ok, lemme pull that step out of the automation. Bounce the vagrants and try again
[12:39:11] <joannac> um
[12:39:23] <joannac> you have to initiate the replica set
[12:39:40] <joannac> why do you have mongods that already have data on them?
[12:40:01] <LowWalker> I don't know lol
[12:40:30] <LowWalker> I'm just following the guide to deploy a replica set with keyfile from the docs, but I am automating it.
[12:41:27] <joannac> well, you're doing it in such a way that the mongod starts with data
[12:41:38] <joannac> which is wrong
[12:42:01] <Derick> uh
[12:42:09] <Derick> you just password protected the channel cheeser
[12:42:13] <cheeser> well, that didn't go as expected. :)
[12:42:42] <cheeser> there
[12:42:42] <joannac> :p
[12:43:00] <cheeser> tried to leave a message but apparently that's just chanserv akicks :)
[12:43:39] <joannac> LowWalker: you can only initiate a replica set if at most one node has data
[12:44:06] <joannac> and in that case, the node with data is the one you want to run rs.initiate() on
[12:44:20] <LowWalker> So these are vagrants, I'm blowing them away each time. I'm installing mongo-org* packages then running that script on the master node
[12:44:29] <LowWalker> To my knowledge there should be no data...
[12:44:41] <joannac> well, there is, somehow
[12:44:59] <LowWalker> wait, I add two users before trying the script.
[12:45:05] <joannac> try connecting to each of them and running show dbs, or db.stats() or something
[12:45:14] <joannac> yeah, that would do it
[12:45:59] <joannac> create vagrants, spin them up, add users on *one mongod only*, rs.initiate()
[12:47:13] <LowWalker> aaaahhh
[12:49:02] <LowWalker> @joannac, should I try with the rs.add() options in script or just the rs.initiate block?
[12:53:17] <LowWalker> Huzzah, you've made my day @joannac. rs.conf() shows all the members now.
[13:16:38] <joannac> LowWalker: :)
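The working sequence joannac describes can be sketched as a single rs.initiate() run once, on the one node that already holds the users, with every member listed up front so no separate rs.add() calls are needed. Hostnames and the set name are placeholders; only 192.168.150.52 appears in the conversation.

```javascript
// Run once, on the single node that has data (placeholder hosts/set name):
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "192.168.150.52:27017" },
    { _id: 1, host: "192.168.150.53:27017" },
    { _id: 2, host: "192.168.150.54:27017" }
  ]
})
```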
[14:03:40] <mw44118> Hey, I have a collection of users and a collection of purchases by user. I want to get all purchases for a given user's email address. I want to do something like db.purchases.find({user_id: db.users.find({email_address: "mork@example.com"}}) But that doesn't work. Is it possible to do a find inside another find?
[14:08:56] <mw44118> How do I do "x matches the c attributes of a.b.c"?
[14:24:55] <mw44118> Am I muted?
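The nested-find syntax mw44118 tried isn't valid; the usual answer is two queries. Field names are taken from his message; using `_id` as the join key is an assumption -- substitute whatever field `purchases.user_id` actually references.

```javascript
// Step 1: look up the user by email.
var user = db.users.findOne({ email_address: "mork@example.com" });

// Step 2: use the result in the second query.
var purchases = db.purchases.find({ user_id: user._id });
```

On MongoDB 3.2+ the `$lookup` aggregation stage can express this join server-side instead.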
[15:52:16] <StephenLynx> kek
[16:37:24] <windsurf_> I'm having trouble getting the syntax right for this query. Just want to get multiple sums per document based on occurrence of various values http://pastebin.com/Wd4siz2v
[16:40:15] <windsurf_> can anyone point out what I'm missing?
[16:42:46] <windsurf_> I'm just trying to count the number of occurrences of each value for a field
[16:43:37] <mw44118> windsurf_: so, like select id, count(*) from ... group by id;
[16:44:53] <windsurf_> mw44118 maybe... is that a SQL query or mongo?
[16:45:02] <windsurf_> looks like SQL
[16:45:07] <windsurf_> generally sounds right
[17:30:33] <mw44118> windsurf_: that was me making sure I understood your goal.
[17:32:13] <AndrewYoung> windsurf_: You can think of each thing you put in the _id field of the $group stage as an additional "group by" field in an equivalent SQL query.
[17:33:17] <AndrewYoung> The result you get would be the same.
[17:33:30] <AndrewYoung> What you want is a second $group stage.
[17:33:38] <AndrewYoung> You want to group the results of the initial grouping.
[17:34:56] <AndrewYoung> Although having said that, it might be easier to do it in code.
[17:48:15] <windsurf_> AndrewYoung Thanks, how about not grouping by jobID but instead just including the jobID field?
[17:48:56] <windsurf_> I've been trying it and so far all I've come up with is jobID:{$first:'$jobID'} but I'm worried that might exclude some jobID values
[17:55:43] <windsurf_> hm. just tried adding a $match before $group. I'll need multiple queries but it will work, not as elegant though
[18:07:04] <AndrewYoung> Sorry, my IRC client took a dive
[18:07:14] <windsurf_> np
[18:07:23] <AndrewYoung> You can use $inc instead of $count I think
[18:07:31] <windsurf_> hm, haven't seen that one
[18:08:13] <windsurf_> I see it
[18:09:00] <AndrewYoung> That might not work in aggregation though, I'm not sure.
[18:10:52] <windsurf_> is it possible to do what I'm trying to do?
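AndrewYoung's "second $group stage" suggestion can be sketched as follows: stage one counts each (jobID, status) pair, stage two regroups by jobID and pushes the per-status counts into an array, giving windsurf_ one document per job. The collection name `parts` is an assumption; `jobID` and `status` come from the conversation.

```javascript
db.parts.aggregate([
  // Stage 1: one count per (jobID, status) pair.
  { $group: {
      _id: { jobID: "$jobID", status: "$status" },
      count: { $sum: 1 }
  } },
  // Stage 2: regroup by job, collecting the per-status counts.
  { $group: {
      _id: "$_id.jobID",
      statusCounts: { $push: { status: "$_id.status", count: "$count" } }
  } }
])
// Expected shape (not verified here):
// { _id: <jobID>, statusCounts: [ { status: "AAA", count: 3 }, ... ] }
```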
[18:25:12] <crazyphil> does mongoc need a keyfile like replica sets do when auth is turned on?
[19:32:06] <crazyphil> how would it be possible that a user with a role of readWrite, could write to some collections, but not others in the same DB?
[19:32:20] <StephenLynx> I think it is.
[19:32:32] <StephenLynx> you can grant permissions based on specific collections.
[19:32:37] <StephenLynx> if I am not mistaken
[19:32:43] <crazyphil> but I granted readWrite to the whole db
[19:32:48] <StephenLynx> hm
[19:32:51] <StephenLynx> I dunno then.
[19:32:55] <crazyphil> what would specifically exempt one single collection
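For reference, the collection-level grants StephenLynx mentions are done with a custom role rather than with readWrite (which always covers the whole db). All names below are placeholders; this is a sketch of the mechanism, not crazyphil's actual setup (his problem turns out to be something else entirely, see below).

```javascript
use mydb
// A role limited to one collection in one db:
db.createRole({
  role: "rwOneCollection",
  privileges: [
    { resource: { db: "mydb", collection: "mycoll" },
      actions: [ "find", "insert", "update", "remove" ] }
  ],
  roles: []
})
db.grantRolesToUser("someUser", [ "rwOneCollection" ])
```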
[19:52:13] <_bahamas> Hi, is there any good free mongo client web interface ?
[19:52:36] <teprrr> mongoclient?
[19:52:40] <StephenLynx> >good
[19:52:41] <StephenLynx> >web
[19:52:44] <StephenLynx> >gui
[19:52:48] <StephenLynx> :^)
[19:53:36] <_bahamas> ok, i will try mongoclient
[19:53:37] <_bahamas> tks
[19:54:34] <teprrr> I'd like to know if there's some tool for doing initial data analysis, similar to kibana :P
[19:54:51] <teprrr> just to get a peek on what kind of values there are, to help building queries
[20:02:48] <crazyphil> aha, found it, forgot about a nodejs process using Winston to log to mongo
[20:03:47] <StephenLynx> kek
[20:27:26] <StephenLynx> weird
[20:27:31] <StephenLynx> anyone here familiar with the nodejs driver?
[20:27:42] <StephenLynx> I figured out that I am using an undocumented function to delete documents.
[20:27:46] <StephenLynx> removeMany.
[20:27:51] <StephenLynx> the documented one is deleteMany
[20:50:41] <crazyphil> ok, how the heck do I create a user that can query on local.oplog.rs AND query on config.shards?
[20:51:04] <crazyphil> I've assigned multiple roles to a user, but it seems as if only the first one is seen
[20:54:58] <AlmightyOatmeal> is it possible to execute a lucene query within mongo or know of a pythonic method of parsing the lucene query to translate it into something mongodb friendly?
[20:55:19] <StephenLynx> lucene?
[20:58:25] <crazyphil> lucene is a text indexing system, most commonly found in Elasticsearch
[20:59:32] <AlmightyOatmeal> StephenLynx: yup, part of our elasticsearch backend; queries look like: (metric:if_octets.rx AND _missing_:programId) AND (tacocat:tacokitty)
[21:00:22] <crazyphil> AlmightyOatmeal: so you have the same data in ES and mongo? I'm not understanding the use case here
[21:02:07] <AlmightyOatmeal> crazyphil: correct but i'm using mongodb as a cache to perform data analytics on. i scroll through parts of ES, seamlessly dump it in mongodb, and go crazy on mongodb so i dont impact the production ES cluster.
[21:02:27] <AlmightyOatmeal> crazyphil: there is so much data in ES that performing analytics directly in ES could take weeks :\
[21:03:30] <crazyphil> sounds like you need a bigger ES cluster then
[21:03:40] <crazyphil> or you need to do some map-reducing
[21:04:51] <AlmightyOatmeal> crazyphil: oh don't get me started on the ES setup... unfortunately that's not my department although i have said "i told you so" to some of the ES architect team more than once
[21:06:30] <AlmightyOatmeal> crazyphil: some of the results contain a script that gets executed by another set of applications and part of that execution is an ES query -- that query is what i'm trying to parse out and turn into a mongodb query
[21:06:52] <AlmightyOatmeal> and so far pyparsing has been an absolute Rube Goldberg wet dream
[21:13:27] <crazyphil> ok, so you query ES with some query {}, and in its results a script is returned that other apps use?
[21:15:05] <crazyphil> that sounds totally nuts
[21:20:36] <crazyphil> for pete's sake it should not be this difficult to give a user rights across multiple db's
[21:29:22] <Doyle> Would there be any issue in doing an rs init, replicating in a new member (to get a new hardware profile), then doing rs remove, stopping mongo, and starting it without the RS config entry to get it in standalone mode? 3.0 mmapv1
[21:56:12] <crazyphil> ok, this is moronic, I assign a user multiple roles, however when I try to actually use BOTH roles assigned to the user, only the first one in the array works
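A sketch of what crazyphil is after: a single user whose roles array scopes each role to its own db, which is the standard way to span local.oplog.rs and config.shards. User name and password are placeholders; the user is created in the admin database so it can authenticate once and read both.

```javascript
db.getSiblingDB("admin").createUser({
  user: "monitor",          // placeholder
  pwd: "changeme",          // placeholder
  roles: [
    { role: "read", db: "local" },   // covers local.oplog.rs
    { role: "read", db: "config" }   // covers config.shards
  ]
})
```

If only the first role in the array appears to work, it is worth checking which database the client authenticated against and that both roles actually show up in `db.getUser()`.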
[22:50:26] <Guest24> So I got a query that seems to take longer every time I go through it
[22:50:38] <Guest24> {"isLive": false, "eloComplete": nil}
[22:50:57] <Guest24> Right now I have an index on both eloComplete and isLive, what can I do to speed this up?
[22:51:11] <StephenLynx> did you tried explain()?
[23:13:42] <Guest24> running explain literally freezes my replica set
[23:13:43] <Guest24> xD
[23:15:45] <StephenLynx> ayyy
[23:16:13] <Guest24> think it was pulling all the rows
[23:16:18] <Guest24> works if I add limit(100) on the end
[23:16:29] <Guest24> collection is like 1.5million records
[23:17:13] <StephenLynx> :^)
[23:18:31] <Guest24> so apparently that execution time is quick
[23:18:34] <Guest24> :/
[23:18:36] <Guest24> weird
[23:34:20] <Guest24> So the actual query in CLI takes 3-6 seconds
[23:34:23] <Guest24> but in mgo for golang it times out
[23:34:36] <Guest24> mgo is meant to be a pretty mature library though
[23:41:30] <cheeser> it's very mature
[23:41:39] <cheeser> all the mongo tools use it
[23:42:50] <Guest24> I wonder how I can debug this :/ I can't make the query any simpler
[23:43:31] <Guest24> I thought maybe it's processing from bytes to structs, but that wouldn't explain why it worked for the first 100k records and is slowing down slowly
[23:43:42] <Guest24> to the point where it now times out
[23:44:50] <Guest24> Please someone tell me if I'm being blind and doing something really silly here: https://gist.githubusercontent.com/jamieshepherd/8910666c6c56f8eab7f12197ab00dcdb/raw/e677df7caaf14bf9b829dd76cd8e6e26a4624790/query.go
[23:45:42] <Guest24> maybe it's the sort..
[23:45:42] <StephenLynx> dunno, never used the go driver.
[23:46:09] <StephenLynx> usually people are able to understand it better when you use the shell syntax.
[23:46:24] <StephenLynx> that sort is plain invalid under the shell syntax.
[23:46:36] <Guest24> well it's just equivalent to {endedAt: 1}
[23:46:45] <StephenLynx> just saying.
[23:47:00] <Guest24> I don't think it's all that difficult to understand to be honest
[23:47:01] <StephenLynx> why its important to use the shell syntax.
[23:47:04] <StephenLynx> kek
[23:47:20] <Guest24> but semantics aside, I think the sort is my bottleneck
[23:47:36] <StephenLynx> how many documents are being sorted by then?
[23:48:03] <StephenLynx> is there an index on that field?
[23:48:06] <Guest24> potentially about a million
[23:48:07] <Guest24> :D
[23:48:13] <Guest24> I guess?
[23:48:21] <StephenLynx> about the index?
[23:48:38] <Guest24> yeah, there's an index on all 3
[23:48:50] <Guest24> for all 3, not a compound on all 3
[23:48:51] <Guest24> endedAt, isLive, eloComplete
[23:48:59] <StephenLynx> no idea then.
[23:49:32] <Guest24> the strange thing is that it's slowed as I've gotten through them, and as I go through them I set eloComplete to false or true (thus not pulling them in again on a subsequent query)
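One plausible reason the three separate single-field indexes don't help: the query filters on isLive and eloComplete and sorts on endedAt, and a single compound index (equality fields first, then the sort field) can serve all of that while the separate indexes force an in-memory sort over the matching million-odd documents. The collection name is a placeholder; Guest24 never names it, and `nil` from his Go snippet becomes `null` in the shell.

```javascript
// Placeholder collection name; equality fields first, sort field last.
db.games.createIndex({ isLive: 1, eloComplete: 1, endedAt: 1 })

// Check that the plan uses the index and has no SORT stage:
db.games.find({ isLive: false, eloComplete: null })
        .sort({ endedAt: 1 })
        .explain()
```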