PMXBOT Log file Viewer


#mongodb logs for Thursday the 5th of June, 2014

[00:07:31] <codenado> hmm quiet in here
[00:59:30] <codenado> hi all. I have a question about many-to-many relationships. I have users and groups, users can be in many groups, groups contain many users. What’s the preferred way to store that? One option is a person list within the group document, and a group list on the user document, but that means data is duplicated
[01:04:05] <joannac> how do you need to access your data
[01:04:25] <joannac> do you need to know who is in a group, as well as what groups someone is in, on a regular basis?
[01:04:54] <codenado> most of the load will be on reading what groups a user is in
[01:05:18] <codenado> less often, will be an admin tool that needs to add/remove users from groups
[01:05:50] <joannac> seems like both of those can be satisfied by having a list of groups in a user document
[01:07:08] <codenado> if i went that route, would the query to get all users in a group be efficient?
[01:08:00] <codenado> it seems like you’d be looping through every single user for that data, or is mongo efficient at that type of query?
[01:12:22] <joannac> you could index on the "groups" field, and then search db.users.find({groups: foobar})
[01:12:40] <joannac> but indexes on arrays can be problematic
[01:17:12] <codenado> thanks, I’ll take a look at multikey indexes
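A sketch of the schema joannac suggests, with collection and field names taken from the conversation and sample data that is purely illustrative: groups live in an array on each user document, and a multikey index on that array serves the "who is in group X" query.

    // Each user document carries the list of groups it belongs to.
    db.users.insert({_id: "alice", groups: ["admins", "editors"]})
    db.users.insert({_id: "bob", groups: ["editors"]})

    // Indexing an array field builds a multikey index: one entry per element.
    db.users.ensureIndex({groups: 1})

    // "What groups is alice in?" - a point lookup on _id.
    db.users.findOne({_id: "alice"}).groups

    // "Who is in editors?" - answered from the multikey index.
    db.users.find({groups: "editors"})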
[02:29:08] <nezt> Hi all
[02:31:14] <nezt> i'm a graduate student, and i have to build a web application that basically serves as a formatted viewport for a mongoDB database. i was wondering what some of you recommend i use as the backend language/framework
[02:31:33] <nezt> node.js? python? c#? java?
[06:56:46] <voidhouse> Is there any mongodb-browser-sync sort of thing?
[06:57:03] <voidhouse> I want to sync a mongo document with the client (browser, at least for now) in real time. Something like firebase.
[07:11:16] <kali> voidhouse: mongodb is not designed to be accessed directly from a browser, except for monitoring and audit
[07:11:29] <kali> voidhouse: you will need some application layer
[07:11:56] <voidhouse> kali: Yeah, that is what I am looking for, something with Node.js, GoLang, or even Python. Or whatever really.
[07:14:21] <kali> voidhouse: ok. well, i have no idea :)
[07:15:49] <rspijker> you’re looking for a web-based shell basically voidhouse ?
[07:29:35] <mipo> hi
[07:30:16] <mipo> I checked the official mongodb website, but I didn't find the source code documentation part. Where is it?
[07:33:54] <noqqe> hi
[07:34:12] <noqqe> whats your preferred way to make your mongos instances highly available?
[07:34:24] <noqqe> having one on each app-server?
[07:34:46] <rspijker> that’s pretty much the default way noqqe, yeah
[07:35:38] <mipo> where is the source code documentation on the mongodb website?
[07:35:39] <kali> same here. that way, the app simply connects to localhost
[07:35:39] <noqqe> my problem is that our appservers are sles. there is no 2.6 on sles :/
[07:35:51] <noqqe> unless you have mongodb enterprise
[07:36:06] <kali> suse ? in 2014 ?
[07:36:32] <noqqe> company policy :/ cant do anything about that
[07:36:53] <kali> germany ?
[07:37:04] <noqqe> *chchch* yes
[07:38:26] <kali> noqqe: have you tried the "plain" linux binary ? the .tar.gz ?
[07:38:46] <kali> noqqe: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-linux/ ?
[07:39:10] <noqqe> yes of course. it works but... you know. its 2014, as you said :)
[07:39:44] <noqqe> but i think its the only way i have now
[07:40:17] <noqqe> kali: do you have experience using dns-roundrobin for (let's say 3) mongos instances ?
[07:40:35] <noqqe> is this a good idea?
[07:40:52] <rspijker> you can just have a connect string with multiple mongos instances in there, can't you?
[07:41:07] <kali> rspijker: i'm not sure... does this work with mongos too ?
[07:41:07] <rspijker> and let the driver work it out
[07:41:25] <rspijker> kali: not 100% sure… I would have to ask our dev guys how we actually do it :P
[07:42:00] <kali> noqqe: i would not dare messing with hostnames and dns and stuff around the mongodb discovery protocols
[07:42:41] <kali> noqqe: i know for sure it leads to all kinds of weird (read: "bad") things in replica sets. maybe for mongos it's less awful, but i would not do it
[07:42:42] <noqqe> kali: i was afraid you'd say something like that
[07:43:31] <noqqe> okay. now im going to maintain the mongos binaries by myself. thank you guys :)
[07:44:47] <rspijker> or just run them on the DB servers… it’s slightly less optimal, but shouldn’t matter much in terms of reliability
[07:49:52] <noqqe> rspijker: and with the multi-host-connectstring thing?
[07:50:06] <noqqe> or where should i point the app to?
[07:52:20] <rspijker> noqqe: yes, it does assume that that actually works...
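For reference, the multi-mongos connect string rspijker describes is a comma-separated host list; the hostnames below are placeholders. Most drivers will fail over between the listed mongos processes, though, as discussed above, the exact behaviour is driver-specific.

    mongodb://mongos1.example.net:27017,mongos2.example.net:27017,mongos3.example.net:27017/mydb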
[07:52:36] <asido> is it possible?
[07:52:56] <rspijker> asido: you need to create a new field for that first, and then do the match
[07:53:06] <rspijker> I think you can do it with a projection
[07:53:22] <rspijker> so first project the sum to a new field, then match on that
[07:53:43] <asido> rspijker, so projection would execute here before each match?
[07:54:05] <rspijker> it would execute earlier in the pipeline
[07:55:48] <rspijker> db.games.aggregate([{$project:{"f":{$add:["$Attrint1", "$Attrint2"]}}}, {$match: {"$f" : {$gt: 1, $lt: 2000}}])
[07:55:51] <rspijker> something like that
[07:56:32] <rspijker> probably my braces are horribly unmatched somewhere...
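With the braces balanced and the $match referring to the projected field by plain name (no leading $ on the left-hand side), rspijker's pipeline would read roughly as follows; "f" and the Attrint field names are taken from the conversation.

    db.games.aggregate([
        {$project: {f: {$add: ["$Attrint1", "$Attrint2"]}}},
        {$match: {f: {$gt: 1, $lt: 2000}}}
    ])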
[09:06:22] <voidhouse> rspijker: No, something like firebase, to keep objects at client side (browser) and the database in sync.
[09:07:09] <rspijker> voidhouse: sounds like meteor
[09:07:33] <voidhouse> rspijker: Yeah, meteor is nice but it's a full stack, I just want a simple API.
[09:08:22] <rspijker> don’t know of any of those
[09:08:30] <voidhouse> Perhaps I can dig through their code and see how it works, problem is though, it is using sock.js/socket.io
[09:09:36] <Nodex> voidhouse : what are you trying to achieve?
[09:10:16] <voidhouse> Nodex: realtime logs.
[09:10:47] <voidhouse> I have lots of processes that push their output to mongodb and I want a way to always see the most recent version that's in the database.
[09:11:01] <voidhouse> The log is not simple line-based though.
[09:11:17] <Nodex> are they not timestamped?
[09:11:32] <voidhouse> A log consists of different parts which get updated.
[09:11:57] <Nodex> add a "last_updated" key?
[09:12:35] <voidhouse> Nodex: Nope. Say I have `log { line_a: { text: "Download Progress" , value: 0.9 }; line_b: {....} }`
[09:12:59] <voidhouse> And the software updates log.line_{a,b}.value as it goes.
[09:13:47] <voidhouse> Nodex: Perhaps that is a way, but that means I will have to keep asking the server for possible changes.
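A minimal sketch of the polling approach Nodex suggests, assuming every write stamps a last_updated field; the client then re-queries with the newest timestamp it has seen. The collection name, _id, and dates are illustrative, not voidhouse's actual schema.

    // Writer side: stamp the document on every update.
    db.logs.update(
        {_id: "job-42"},
        {$set: {"line_a.value": 0.9, last_updated: new Date()}}
    )

    // Client side: poll for anything changed since the last check.
    var lastSeen = ISODate("2014-06-05T09:00:00Z")
    db.logs.find({last_updated: {$gt: lastSeen}})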
[11:10:12] <jekle> I'm considering replacing mysql in our new webshop/cms because we need to define attributes dynamically for several entities within the system.
[11:10:58] <jekle> is this a common use case for mongodb? I have been reading articles and watching yt videos for days, and I am so unsure
[11:11:22] <asido> any ideas why I have such behavior when specifying variable in $projection: http://paste.kde.org/pxhf0fgje ?
[11:19:14] <Nodex> jekle : if your shop needs transactions then you shouldn't move the whole app to Mongodb
[11:19:27] <Nodex> nothing wrong with moving parts of it to mongo though
[11:25:30] <jekle> the app's audiences are quite small and we aren't the greatest devs, in short: we haven't used transactions yet :) but no problems so far. thus I think it's maybe not so important for us. Thinking about using both, I am concerned about too much complexity. managing two permanently related datastores sounds somehow like a challenge
[11:25:34] <rainerfrey> I just set up a replica set for testing, with all nodes on the same host, and the logs of the mongods are full of entries like connection accepted from xx.xx.xx.xx, and end connection xx.xx.xx.xx
[11:26:33] <rainerfrey> each connection seems to live for 20-30 seconds. Is it normal that a replica set uses such short-lived connections among its members?
[11:27:14] <rainerfrey> and: is there a way to reduce that log output, as this seems to be quite unnecessary information
[11:28:03] <rainerfrey> (I read about the configuration option "quiet", but that seems to suppress a lot more information and is not recommended in the docs)
[11:30:13] <rainerfrey> I use mongodb 2.6.1 installed from the debian package, and did not change any configuration options related to log level.
[11:39:23] <kali> rainerfrey: logging connection management is important, as connections can quickly become a scarce resource, so i would not recommend silencing this
[11:40:09] <kali> rainerfrey: as for the replica chatting, i think it is a relatively recent behaviour of 2.6, but i have seen it in working clusters, so i would not worry too much about it
[11:47:52] <rainerfrey> wow ... this is like 900-1000 log entries per hour, on an unused replica set. That is many times the log volume of 6 application servers and microservice web applications together
[11:49:07] <kali> you're worrying about 1000 log lines per hour ? :)
[11:52:44] <rainerfrey> yes I do worry about 100 log lines per hour of just connection opened and connection closed events in an unused system. And I do worry about how to notice important information among all this, especially as mongodb does not tag log entries with any kind of level or priority
[11:52:58] <rainerfrey> that's 1000 of course
[11:54:53] <rainerfrey> BTW can anyone explain the connection (or difference) between the logLevel parameter and the verbosity configuration file option?
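rainerfrey's last question goes unanswered in the log; as far as I know, both knobs control the same global verbosity level (0-5): the configuration file option sets it at startup, while logLevel is a server parameter that can be inspected and changed at runtime, e.g.:

    // Read the current verbosity, then raise it without a restart.
    db.adminCommand({getParameter: 1, logLevel: 1})
    db.adminCommand({setParameter: 1, logLevel: 1})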
[13:35:48] <AlexejK> if my server is under heavy load (and for instance has huge IOWait %), when I have my write concern set to 1 (acknowledged), will that mean the operation returns as soon as a read for that object will no longer yield an empty value? Basically I'm having the problem that even though I have ACKNOWLEDGED set, and read pref set to primary on this entity, when my server is overloaded my operation returns but the next read for the same object fails (gets no hit)
[13:37:41] <rspijker> AlexejK: that’s weird…
[13:39:05] <pd> Hi
[13:39:26] <pd> My mongo server seems to having very high loads on linux
[13:39:51] <pd> very high means the load is showing 1000
[13:40:00] <pd> CPU utilization looks fine
[13:40:09] <pd> mem utilization looks fine
[13:40:18] <AlexejK> rspijker: I know.. that's why I'm wondering :-/ For me Acknowledged means that the server is ready to respond with it BUT may not have flushed it to disk yet
[13:40:30] <pd> and the disk is SSD that is also not over utilized..
[13:40:40] <rspijker> pd: what load is this? the load shown by top?
[13:40:45] <pd> yes by top
[13:40:52] <pd> it is showing 5000
[13:41:20] <rspijker> AlexejK: do you have journaling on that server?
[13:41:30] <AlexejK> yes
[13:41:49] <rspijker> is it sharded?
[13:41:56] <AlexejK> no not at the moment
[13:42:06] <rspijker> so you are talking directly to a mongod?
[13:42:13] <AlexejK> yes
[13:42:23] <AlexejK> via java driver
[13:42:26] <rspijker> through a driver or the shell?
[13:42:28] <rspijker> ah, ok
[13:43:07] <AlexejK> source claims that it's equal to w: 1, wtimeout: 0, j: false (i think)
[13:43:27] <rspijker> and if you wait for a few secs the read does return a hit?
[13:43:39] <AlexejK> yes
[13:43:42] <rspijker> pd: that’s a fairly high value to get for load… :s
[13:44:12] <AlexejK> that's whats confusing me and right now keeps me banging my head on the keyboard
[13:44:34] <rspijker> pd: so I’m guessing you have a fairly large wait queue then?
[13:44:55] <rspijker> AlexejK: and you are sure the java driver is executing things in the order you expect?
[13:47:27] <AlexejK> it's not async processing in that case.. it's a rest call that generates the object and saves it; when the save operation (non-async) returns, it returns the key (kind of like _id) back to the client. Client in the next call uses the key to make another call, and when we process that call we check for existence of the object with that key in the DB.. that's where, only under heavy load of mongo, I'm getting no results, but a sec or 2 later I can get a hit
[13:47:56] <pd> rspijker: sorry I had an internet issue
[13:48:05] <pd> Yes the load is as shown by top
[13:48:09] <AlexejK> "it returns the key" = "we return the key to the client"; the key is generated before the save operation and is part of the object saved
[13:48:51] <rspijker> pd: what does vmstat show?
[13:49:11] <pd> checking now...
[13:50:14] <rspijker> AlexejK: ok… not 100% sure about the java driver. But I assume that it blocks until the write is acknowledged if you tell it to
[13:52:17] <AlexejK> rspijker: yes, that's what it says. It basically uses the {w:1, wtimeout:0, fsync: false, j:false } But w:1 should mean that server should return a hit on that one from that moment onward, right?
[13:54:28] <rspijker> yeah, assuming you are reading from the primary
[13:55:55] <rspijker> what is your read preference AlexejK ?
[13:57:02] <AlexejK> primary is a read-pref
[13:57:33] <rspijker> is it primary or primaryPreferred?
[13:57:46] <AlexejK> just primary specifically
[13:57:59] <AlexejK> but i'm thinking if it's smarter to have primaryPreferred and w: 2 or something
[13:58:26] <AlexejK> that would mean longer response time on initial call but will offload primary.. but that's another topic :-)
[13:58:35] <AlexejK> right now it's primary (not primaryPreferred)
[14:00:15] <rspijker> then I really don’t understand
[14:00:58] <AlexejK> me neither, that's what's killing me :-(
[14:03:12] <rspijker> you are on 2.6 right?
[14:03:22] <Fishy> Is this the correct place to ask mongoose questions? Getting my pastebin ready...
[14:04:43] <rspijker> Fishy: there are usually some people around that use mongoose. So while it’s not the official place (not sure there is one), it’s often a good enough place to ask
[14:07:04] <AlexejK> rspijker: This *may* be a Spring-Data issue (we use it in some places).. I think it does some mumbo-jumbo with preparing things on some operations... will dig in more
[14:07:06] <AlexejK> and yes 2.6
[14:07:38] <rspijker> AlexejK: ok, all I can say is that afaict you are doing everything correctly from a mongo POV
[14:30:08] <AlexejK> rspijker: thanks, that at least eliminates that problem.. So i'll try to figure out if there is something else going on
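One way to take the driver out of the equation, as a debugging step: issue the same write with an explicit write concern from the 2.6 shell and read it straight back on the primary. The collection and document below are illustrative, not AlexejK's actual schema.

    // Acknowledged write - what {w: 1, j: false} amounts to.
    db.things.insert({_id: "key-123", state: "new"}, {writeConcern: {w: 1}})

    // Stricter variant that also waits for the journal commit.
    db.things.insert({_id: "key-124", state: "new"}, {writeConcern: {w: 1, j: true}})

    // The read-after-write that should immediately find the document.
    db.things.findOne({_id: "key-123"})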
[14:47:50] <michaelchum> I'm trying to insert over 40M documents with a script in PyMongo but the process slows down and stops at 3M documents
[14:48:27] <michaelchum> I found something weird in my mongod.log: [clientcursormon] mem (MB) res:1426 virt:13045 mapped:6094
[14:49:10] <michaelchum> The virtual memory (virt:) keeps increasing but stops at 13045, mapped stops at 6094. Every time I try running the script
[14:49:33] <Derick> why would that not be normal?
[14:49:36] <michaelchum> Any ideas?
[14:51:19] <michaelchum> I mean, my documents stop inserting after 3M :(
[14:51:50] <Derick> are you swapping?
[14:51:57] <Derick> what does htop say, and mongotop?
[14:52:15] <michaelchum> And I tried to erase my db and run the script multiple times; I found a correlation with the virt: mem value at the point the insertion stops
[14:52:52] <michaelchum> htop, doesn't use a lot of RAM only 1gb/4gb, CPU drops when insertion stops
[14:53:01] <michaelchum> I believe that I have no swap
[14:53:09] <michaelchum> in Ubuntu*
[14:53:23] <michaelchum> Haven't tried mongotop should I?
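As an aside on the 40M-document load: the 2.6 shell's Bulk API sends inserts in batches rather than one round-trip per document, which makes the load faster and stalls easier to spot. A rough sketch; the collection name and document shape are illustrative.

    var bulk = db.mycollection.initializeUnorderedBulkOp()
    var pending = 0
    for (var i = 0; i < 40000; i++) {
        bulk.insert({n: i, payload: "..."})
        pending++
        if (pending === 1000) {       // flush in batches of 1000
            bulk.execute()
            bulk = db.mycollection.initializeUnorderedBulkOp()
            pending = 0
        }
    }
    if (pending > 0) bulk.execute()   // flush the remainder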
[15:14:42] <modcure> "Tracklist" : [ { "Track" : "1", "Title" : "Smells Like Teen Spirit", "Length" : "5:02" }, { "Track" : "2", "Title" : "In Bloom", "Length" : "4:15" } ] <-- i only want to get back the embedded document that contains "In Bloom" .. db.spreadsheets.find( { "Tracklist.Title" : "In Bloom" }, {Tracklist: 1} ) but it's returning all the embedded documents within the array
[15:25:14] <sqlnoob> I'm trying to get the number of unique users per month with logged-in activity between two dates, more specifically month-wise. I tried db.runCommand() with distinct on ApplicationLog, userId as the key, and a query on activityType and the date field, but I get 0 as the result. Can anyone suggest a better approach for this? thanks
[15:25:40] <sqlnoob> here is the paste: http://pastie.org/private/jdk0gddacnjrz7htpwlma
[15:27:44] <sqlnoob> I feel like aggregation is a good way to go for it. trying to figure out using documentation but no success so far
[15:31:05] <rspijker> modcure: you can do "Tracklist.$":1, but it will still give you a wrapping array, just with only 1 element
[15:34:33] <rspijker> sqlnoob: not sure what it is you are after exactly… Would the month be an input and the number of unique users during that month the output?
[15:36:04] <sqlnoob> exactly rspijker
[15:36:23] <sqlnoob> like: may: 24, april: 32
[15:37:35] <sqlnoob> I'm trying aggregate with $match: {date: {$lte: lowerDateValHere, $gte: higherDateValHere}}
[15:37:48] <sqlnoob> not so good with mongodb and aggregation so having hard time
[15:37:52] <mistawright> hi guys i need some help. i need to clear my mogodb as I currently have filled this server up and did not realize it. I am however unable to run the mongod daemon as i am out of space. how can clear these db's so i can resume operation?
[15:38:35] <rspijker> sqlnoob: maybe do a group on the month, then addToSet for userIds and then project a count of that
[15:40:08] <sqlnoob> thanks rspijker. not sure what that exactly means but I'll work on it. need to do quite a reading. thanks
[15:40:12] <rspijker> sqlnoob: {$group: {"_id":{$month:"$date"}, "uids":{$addToSet: "$userId"}},{$project:{"logins":{$size:"$uids"}}}
[15:40:16] <rspijker> something like that
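Wrapped in an aggregate() call, with the date-range $match up front, rspijker's suggestion would read roughly as below. Field names follow the conversation; the activityType value and date bounds are placeholders.

    db.ApplicationLog.aggregate([
        {$match: {activityType: "login",
                  date: {$gte: ISODate("2014-01-01"), $lt: ISODate("2014-06-01")}}},
        {$group: {_id: {$month: "$date"}, uids: {$addToSet: "$userId"}}},
        {$project: {logins: {$size: "$uids"}}}
    ])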
[15:40:52] <rspijker> mistawright: just remove the files?
[15:41:32] <rspijker> in your dbpath there will be a bunch of files or directories (depending on your directoryPerDB setting) which you can simply remove
[15:42:19] <sqlnoob> rspijker: thanks again
[15:42:37] <rspijker> np, good luck :)
[15:42:42] <mistawright> rspijker, i have two files there that relate to graylog. do i just delete them and if that is the case does that remove the db as well? I need to continue to aggregate messages/logs
[15:43:27] <sqlnoob> much appreciated
[15:45:17] <modcure> rspijker, where do i place this in the query "Tracklist.$":1 ?
[15:45:57] <rspijker> modcure: that would be the projection
[15:46:15] <pd> rspijker: output of vmstat http://pastebin.com/tjRVZmUh
[15:46:23] <rspijker> you’re already doing a projection: “Tracklist”:1
[15:46:28] <rspijker> just add the .$ there
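Put together, modcure's query with the positional projection looks like this; it returns the Tracklist array holding only the first element that matched the query.

    db.spreadsheets.find(
        {"Tracklist.Title": "In Bloom"},
        {"Tracklist.$": 1}
    )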
[15:46:34] <pd> top show 11000
[15:47:13] <rspijker> pd: that's really strange, since you have very few processes runnable or in uninterruptible sleep...
[15:47:26] <pd> yes I know
[15:47:44] <pd> mongo is taking lot of cpu 300%CPU
[15:47:52] <rspijker> how responsive is the system?
[15:48:16] <modcure> rspijker, that worked, thanks
[15:49:33] <pd> system is quite ok
[15:49:59] <rspijker> modcure: np
[15:50:06] <pd> in between it had given the fork error whenever I ran any command
[15:50:13] <pd> so I increased ulimit to the max..
[15:50:18] <pd> but it still happens
[15:50:54] <rspijker> what is your connection count on mongod?
[15:51:20] <rspijker> pd: db.serverStatus().connections
[15:51:32] <pd> let me check
[15:51:49] <pd> -bash: fork: Cannot allocate memory
[15:51:53] <pd> this comes from time to time..
[15:52:01] <pd> whenever I run any command
[15:52:04] <pd> goes after some time
[15:52:21] <rspijker> that does sound an awful lot like your # processes is exhausted
[15:52:56] <rspijker> ulimit -u
[15:53:17] <rspijker> (mongod needs a thread for every connection, hence I'm having you check that)
[15:54:33] <pd> ulimit -u gives me 65535
[15:55:05] <rspijker> I’ve got to go
[15:55:06] <rspijker> good luck pd
[15:55:42] <pd> ok
[15:56:19] <pd> thanks
[18:01:15] <jzuijlek> hi
[18:01:31] <jzuijlek> need help resetting MMS two-factor authentication
[18:04:36] <jzuijlek> any support online?
[18:18:52] <BadHorsie> In an aggregate, how can I do something like {$group:{_id:"$field1"+"$field2","maxBlah"... to group by two fields?
[18:40:26] <jzuijlek> db.users.aggregate(
[18:40:26] <jzuijlek> { $group: {
[18:40:26] <jzuijlek> _id: { group_name: "$group_name", status: "$status" },
[18:40:26] <jzuijlek> 'total_sum': { $sum: 1 }
[18:40:26] <jzuijlek> }}
[18:40:27] <jzuijlek> )
[18:45:09] <michaelchum> Hi, I'm trying to insert 40M documents but at some point the insert pauses every 5 minutes and then eventually stops at 3M documents. I've also noticed that mongotop always says 0 for everything during the insert, there's 0ms in write and read
[18:45:36] <michaelchum> Any ideas?
[18:53:51] <BadHorsie> Thanks jzuijlek!
[18:54:41] <jzuijlek> any MMS support online?
[18:57:32] <jzuijlek> What driver/language are you using michaelchum?
[18:57:47] <michaelchum> pymongo!
[18:58:44] <jzuijlek> hmm, don't have any experience with using mongodb with python.
[18:59:10] <jzuijlek> Isn't the python script running into a timeout?
[18:59:54] <michaelchum> I don't think so. Well, it's just a big loop which iterates, and every iteration computes a document and inserts it
[19:00:29] <michaelchum> But do you think it's normal that mongotop always says 0 while the db.collection.count() increases?
[19:00:59] <michaelchum> The insertion always stop at 3M documents
[19:02:23] <amcrn> general question: is there a downside to defaulting to a sharded deployment with a single shard (a replica-set of three members) vs. using a vanilla replica-set with three members? if you know you're going to have some deployments that absolutely need shards, why not just default to a sharded strategy for all to simplify operational complexity (assuming you're willing to eat the cost of config-servers and query routers)
[19:03:35] <jzuijlek> mongostat could be more helpful in your case
[19:04:56] <michaelchum> oh ok, I do have 500 inserts per second, going to wait until 3M and see what happens, thanks jzuijlek!
[19:05:19] <jzuijlek> Good luck
[19:12:53] <amcrn> (i found a mailing list entry @ https://groups.google.com/forum/#!msg/mongodb-user/FyYw2jczyHA/PIkvU8fpwFAJ on 11/20/13 seeming to indicate it's a reasonable idea, but was curious if anything has changed)
[19:26:17] <biggbear> hello folks. iterating over a collection is returning a lot of copies of the same object. I have a lot of identical results like this { "_id" : { "$oid" : "4ffb8b1724acb9a87fbcd4b5"}...
[19:26:57] <biggbear> and this log: info DFM::findAll(): extent 2:3d48000 was empty, skipping ahead.
[19:34:53] <insanidade> hi all. what access permission/role should I grant in order to solve the following error when trying to connect to mongodb: Error while trying to show server startup warnings: not authorized on admin to execute command { getLog: "startupWarnings" }
[19:35:06] <insanidade> any hints ?
[20:04:21] <BlakeRG> can anyone tell me why i get this error? ReferenceError: birthday is not defined
[20:04:31] <BlakeRG> from an inline command
[20:04:34] <BlakeRG> mongo 10.4.15.243/mydb --eval 'db.user_data.distinct('birthday');'
[20:22:38] <BadHorsie> Probably the quotes... You are quoting the outside of the statement with '
[20:24:30] <BadHorsie> Try "db.user_data.distinct('birthday');"
[20:24:39] <BadHorsie> Hmm am I talking alone again? hah.
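Assembled, the fix BadHorsie suggests is simply to swap the quote styles, so the single quotes survive inside the double-quoted --eval argument:

    mongo 10.4.15.243/mydb --eval "db.user_data.distinct('birthday');"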
[20:31:19] <biggbear> what could be the reason for getting lots of documents with the same "_id"? (info DFM::findAll(): extent 2:3d48000 was empty, skipping ahead.)
[20:31:51] <cheeser> sharded?
[20:32:21] <biggbear> sharded?
[20:33:31] <biggbear> not sharded
[22:21:39] <biggbear> List<DBObject> list = collection.findAll().toArray(); iterating over this list i'm getting a lot of exactly the same documents (same "_id"). Driving me crazy
[22:41:26] <biggbear> the strange thing is that if i search for one of those documents i get only one
[23:05:44] <proteneer> The storageSize does not decrease as you remove or shrink documents.
[23:05:47] <proteneer> what?
[23:06:00] <proteneer> you guys wrote a monotonically increasing database?
[23:09:58] <cheeser> no, diskspace is not released back to the OS
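To expand on cheeser's point: mongod of this era reuses freed space within its existing data files but never hands it back to the filesystem. Two commonly cited remedies, sketched below: compact defragments one collection in place (space becomes reusable, files stay the same size), while repairDatabase rewrites the data files and can actually shrink them, at the cost of blocking the database and needing free disk headroom.

    // Defragment a single collection; does not shrink the files on disk.
    db.runCommand({compact: "mycollection"})

    // Rewrite all data files for the current database; can return space to the OS.
    db.repairDatabase()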